09 Oct, 2013

30 commits

  • At this point sk might contain garbage.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • TCP listener refactoring, part 4:

    To speed up inet lookups, we moved IPv4 addresses from inet to struct
    sock_common

    Now it is time to do the same for IPv6, because it permits fast lookups
    for all kinds of sockets, including the upcoming SYN_RECV.

    Getting IPv6 addresses in TCP lookups currently requires two extra cache
    lines, plus a dereference (and memory stall).

    inet6_sk(sk) does the dereference of inet_sk(__sk)->pinet6

    This patch is way bigger than its IPv4 counterpart because, for IPv4,
    we could add aliases (inet_daddr, inet_rcv_saddr), while on IPv6
    this is not easily doable.

    inet6_sk(sk)->daddr becomes sk->sk_v6_daddr
    inet6_sk(sk)->rcv_saddr becomes sk->sk_v6_rcv_saddr

    Timewait sockets also have tw->tw_v6_daddr & tw->tw_v6_rcv_saddr
    at the same offsets.

    We get rid of INET6_TW_MATCH() as INET6_MATCH() is now the generic
    macro.
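
    For illustration, the idea is that every socket flavor now carries the
    IPv6 lookup keys at a fixed offset in the shared leading struct (a
    simplified sketch, not the real layout of struct sock_common):

        struct sock_common {
                /* ... existing IPv4 lookup keys (skc_daddr, ...) ... */
                struct in6_addr skc_v6_daddr;     /* seen as sk->sk_v6_daddr */
                struct in6_addr skc_v6_rcv_saddr; /* sk->sk_v6_rcv_saddr */
        };

    Since struct sock and struct inet_timewait_sock share this leading part,
    a single INET6_MATCH() can test either type without dereferencing pinet6.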

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • TCP listener refactoring, part 3:

    Our goal is to hash SYN_RECV sockets into main ehash for fast lookup,
    and parallel SYN processing.

    The current inet_ehash_bucket contains two chains: one for ESTABLISHED
    (and related states) sockets, another for TIME_WAIT sockets only.

    As the hash table is sized to get at most one socket per bucket, it
    makes little sense to have separate twchain, as it makes the lookup
    slightly more complicated, and doubles hash table memory usage.

    If we make sure all socket types have the lookup keys at the same
    offsets, we can use a generic and faster lookup. It turns out TIME_WAIT
    and ESTABLISHED sockets already have common lookup fields for IPv4.

    [ INET_TW_MATCH() is no longer needed ]

    I'll provide a follow-up to factor the IPv6 lookup as well and remove
    INET6_TW_MATCH().

    This way, SYN_RECV pseudo sockets will be supported in the same way.

    A new sock_gen_put() helper is added, doing either a sock_put() or
    inet_twsk_put() [ and will support SYN_RECV later ].

    Note this helper should only be called in the real slow path, when an RCU
    lookup found a socket that was moved to another identity (freed/reused
    immediately), but it could eventually be used in other contexts, like
    sock_edemux().

    Before patch:

    dmesg | grep "TCP established"

    TCP established hash table entries: 524288 (order: 11, 8388608 bytes)

    After patch:

    TCP established hash table entries: 524288 (order: 10, 4194304 bytes)
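
    The new helper is essentially a type dispatch on release; a simplified
    sketch of the idea:

        void sock_gen_put(struct sock *sk)
        {
                if (!atomic_dec_and_test(&sk->sk_refcnt))
                        return;

                if (sk->sk_state == TCP_TIME_WAIT)
                        inet_twsk_free(inet_twsk(sk)); /* timewait flavor */
                else
                        sk_free(sk);                   /* full socket */
                /* a SYN_RECV branch can be added here later */
        }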

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Conflicts:
    include/linux/netdevice.h
    net/core/sock.c

    Trivial merge issues.

    Removal of "extern" for functions declaration in netdevice.h
    at the same time "const" was added to an argument.

    Two parallel line additions in net/core/sock.c

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Ben Hutchings says:

    ====================
    Some more fixes for EF10 support; hopefully the last lot:

    1. Fixes for reading statistics, from Edward Cree and Jon Cooper.
    2. Addition of ethtool statistics for packets dropped by the hardware
    before they were associated with a specific function, from Edward Cree.
    3. Only bind to functions that are in control of their associated port,
    as the driver currently assumes this is the case.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Steinar reported FQ pacing was not working for UDP flows.

    It looks like the initial sk->sk_pacing_rate value of 0 was
    a wrong choice. We should initialize it to ~0U (unlimited).

    Then, TCA_FQ_FLOW_DEFAULT_RATE should be removed because it makes
    no real sense. The default rate is really unlimited, and we
    need to avoid a divide by zero.
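
    A sketch of the intent (simplified, not the exact fq scheduler code):

        u32 rate = sk->sk_pacing_rate; /* ~0U now means "unlimited" */

        if (rate != ~0U && rate != 0)  /* guard the divide */
                delay_ns = div64_u64((u64)skb->len * NSEC_PER_SEC, rate);
        else
                delay_ns = 0;          /* no pacing: send immediately */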

    Reported-by: Steinar H. Gunderson
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • This reverts commit 612c337306f00dc8d396830212de51c475844791.

    As per Stephen Hemminger, the layout of the netlink attribute
    is not implemented correctly so revert this for now.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Add the missing destroy_workqueue() before returning from
    qlcnic_probe() in the error handling path.
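
    The usual unwind shape (an illustrative sketch; the label, step and
    workqueue field names are assumptions, not the driver's exact code):

        err = some_later_probe_step(adapter); /* hypothetical step */
        if (err)
                goto err_destroy_wq;
        return 0;

    err_destroy_wq:
        destroy_workqueue(adapter->qlcnic_wq); /* the missing cleanup */
        return err;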

    Signed-off-by: Wei Yongjun
    Signed-off-by: David S. Miller

    Wei Yongjun
     
  • This patch fixes the error handling in moxart_mac_probe():
    - return -ENOMEM in the memory allocation failure cases
    - add the missing free_netdev() in the error handling path
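
    The -ENOMEM half follows the classic pattern (sketch; names are
    hypothetical):

        priv->tx_buf_base = kmalloc(priv->tx_buf_size, GFP_KERNEL);
        if (!priv->tx_buf_base) {
                ret = -ENOMEM;  /* previously ret could remain 0 here */
                goto init_fail; /* init_fail now also frees the netdev */
        }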

    Signed-off-by: Wei Yongjun
    Signed-off-by: David S. Miller

    Wei Yongjun
     
  • This patch fixes the calculation of the nlmsg size by adding the
    missing nla_total_size().
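
    For context, such sizes are computed along these lines (sketch; the
    attributes here are hypothetical):

        /* nla_total_size() accounts for the attribute header and padding,
         * not just the payload */
        static size_t example_nlmsg_payload_size(void)
        {
                return NLMSG_ALIGN(sizeof(struct rtgenmsg))
                       + nla_total_size(sizeof(u32))  /* attr A */
                       + nla_total_size(sizeof(u32)); /* attr B */
        }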

    Cc: Patrick McHardy
    Signed-off-by: Marc Kleine-Budde
    Signed-off-by: David S. Miller

    Marc Kleine-Budde
     
  • TCA_FQ_INITIAL_QUANTUM should set q->initial_quantum
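
    The fix is presumably the one-liner (sketch):

        if (tb[TCA_FQ_INITIAL_QUANTUM])
                q->initial_quantum = nla_get_u32(tb[TCA_FQ_INITIAL_QUANTUM]);
        /* the typo had this branch assigning q->quantum instead */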

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Unlike ipv4, the struct member hlen holds the length of the GRE and ipv6
    headers, and this length is also counted in dev->hard_header_len.
    Perhaps it would be cleaner to modify hlen to count only the GRE header,
    without the ipv6 header, as the variable name suggests, but the simple
    way to fix this without regression risk is to modify the calculation of
    the limit in ip6gre_tunnel_change_mtu().
    Verified on kernel version v3.11.
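
    In rough terms (an illustrative sketch, not the actual patch):

        /* tunnel->hlen already includes the ipv6 header here, so the upper
         * bound must not count that header a second time through
         * dev->hard_header_len */
        if (new_mtu < IPV6_MIN_MTU || new_mtu > 0xFFF8 - tunnel->hlen)
                return -EINVAL;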

    Signed-off-by: Oussama Ghorbel
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Oussama Ghorbel
     
  • We can get the classid through cgroup_subsys_state;
    this is more direct and efficient.

    Signed-off-by: Gao feng
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Gao feng
     
  • Since the tasks have been migrated to the cgroup,
    there is no need to call task_netprioidx() to get
    the task's cgroup id.

    Signed-off-by: Gao feng
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Gao feng
     
  • Since the removal of the routing cache, fib_compute_spec_dst() does a
    fib_table lookup for each UDP multicast packet received. This has
    introduced a performance regression for some UDP workloads.

    This change skips populating the packet info for sockets that do not have
    IP_PKTINFO set.

    Benchmark results from a netperf UDP_RR test:
    Before 89789.68 transactions/s
    After 90587.62 transactions/s

    Benchmark results from a fio 1 byte UDP multicast pingpong test
    (Multicast one way unicast response):
    Before 12.63us RTT
    After 12.48us RTT
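
    The core of the change is a guard of roughly this shape (sketch,
    simplified from the real call site):

        /* only pay for the fib lookup when the receiver asked for the
         * ancillary data via setsockopt(IP_PKTINFO) */
        if (inet_sk(sk)->cmsg_flags & IP_CMSG_PKTINFO)
                ipv4_pktinfo_prepare(skb);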

    Signed-off-by: Shawn Bohrer
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Shawn Bohrer
     
  • The removal of the routing cache introduced a performance regression for
    some UDP workloads since a dst lookup must be done for each packet.
    This change caches the dst per socket in a similar manner to what we do
    for TCP by implementing early_demux.

    For UDP multicast we can only cache the dst if there is only one
    receiving socket on the host. Since caching only works when there is
    one receiving socket we do the multicast socket lookup using RCU.

    For UDP unicast we only demux sockets with an exact match in order to
    not break forwarding setups. Additionally since the hash chains may be
    long we only check the first socket to see if it is a match and not
    waste extra time searching the whole chain when we might not find an
    exact match.

    Benchmark results from a netperf UDP_RR test:
    Before 87961.22 transactions/s
    After 89789.68 transactions/s

    Benchmark results from a fio 1 byte UDP multicast pingpong test
    (Multicast one way unicast response):
    Before 12.97us RTT
    After 12.63us RTT
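
    Conceptually, the receive path becomes (a hedged sketch; the lookup
    helper name is hypothetical):

        /* early demux: find the single matching socket under RCU and reuse
         * its cached dst, skipping the per-packet route lookup */
        sk = udp_lookup_exact_or_sole_mcast(skb); /* hypothetical */
        if (sk) {
                struct dst_entry *dst = ACCESS_ONCE(sk->sk_rx_dst);

                if (dst)
                        skb_dst_set_noref(skb, dst);
        }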

    Signed-off-by: Shawn Bohrer
    Signed-off-by: David S. Miller

    Shawn Bohrer
     
  • UDP sockets can receive packets from multiple endpoints, so their packets
    may arrive on multiple receive queues. Since packets can arrive on
    multiple receive queues we should not mark the napi_id for all packets.
    This makes busy read/poll only work for connected UDP sockets.

    This additionally enables busy read/poll for UDP multicast packets as
    long as the socket is connected by moving the check into
    __udp_queue_rcv_skb().
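
    Sketched placement (simplified; the exact connected-socket test is an
    assumption):

        static int __udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
        {
                /* only a connected socket maps to a single flow, so only
                 * then is a per-packet napi_id meaningful for busy poll */
                if (inet_sk(sk)->inet_daddr)
                        sk_mark_napi_id(sk, skb);

                /* ... normal queueing follows ... */
        }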

    Signed-off-by: Shawn Bohrer
    Suggested-by: Eric Dumazet
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Shawn Bohrer
     
  • qdisc_tree_decrease_qlen() is called when some packets are dropped
    on a qdisc, and we want to notify parents of qlen changes.

    We can also increment the parents' qdisc qstats drop counters.

    This permits more accurate drop counters up to root qdisc.

    For example, a graft operation typically resets a qdisc
    (drops all packets) and calls qdisc_tree_decrease_qlen().

    Note that callers are responsible for their drop counters.
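
    The shape of the walk, with the new accounting marked (a simplified
    sketch of the helper):

        void qdisc_tree_decrease_qlen(struct Qdisc *sch, unsigned int n)
        {
                u32 parentid;

                while ((parentid = sch->parent)) {
                        sch = qdisc_lookup(qdisc_dev(sch), TC_H_MAJ(parentid));
                        if (!sch)
                                break;
                        sch->q.qlen -= n;
                        sch->qstats.drops += n; /* new: account the drops */
                        /* per-class qlen_notify() callbacks omitted */
                }
        }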

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • If a guest is destroyed without transitioning its frontend to CLOSED,
    the domain becomes a zombie as netback was not grant unmapping the
    shared rings.

    When removing a VIF, transition the backend to CLOSED so the VIF is
    disconnected if necessary (which will unmap the shared rings etc).

    This fixes a regression introduced by
    279f438e36c0a70b23b86d2090aeec50155034a9 (xen-netback: Don't destroy
    the netdev until the vif is shut down).

    Signed-off-by: David Vrabel
    Cc: Ian Campbell
    Cc: Wei Liu
    Cc: Paul Durrant
    Acked-by: Wei Liu
    Reviewed-by: Paul Durrant
    Signed-off-by: David S. Miller

    David Vrabel
     
  • Amir Vadai says:

    ====================
    net/mlx4_en: Fix pages never dma unmapped on rx

    This patchset fixes a bug introduced by commit 51151a16 (mlx4: allow
    order-0 memory allocations in RX path), where dma_unmap_page() wasn't
    called.

    Changes from V0:
    - Added "Rename name of mlx4_en_rx_alloc members". Old names were confusing.
    - Last frag in page calculation was wrong. Since all frags in page are of the
    same size, need to add this frag_stride to end of frag offset, and not the
    size of next frag in skb.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • This patch fixes a bug introduced by commit 51151a16 (mlx4: allow
    order-0 memory allocations in RX path).

    dma_unmap_page() is never reached because the condition used to detect
    the last fragment in a page is wrong: offset + frag_stride can't be
    greater than size. We need to make sure no additional frag will fit in
    the page, i.e. compare offset + frag_stride + next_frag_size instead.
    next_frag_size is the same as the current one, since a page is shared
    only with frags of the same size.
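
    As arithmetic, the corrected test is roughly (sketch; names are
    illustrative):

        /* another same-size frag fits only if its end stays inside the
         * page; otherwise this frag is the last user of the page and the
         * page may be dma_unmap_page()'d */
        last_frag_in_page =
                frag->page_offset + frag_stride + frag_size > page_size;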

    CC: Eric Dumazet
    Signed-off-by: Amir Vadai
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Amir Vadai
     
  • Add a page prefix to the page-related members: rename @size and @offset
    to @page_size and @page_offset.

    CC: Eric Dumazet
    Signed-off-by: Amir Vadai
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Amir Vadai
     
  • Currently, in TLB mode we change mac addresses only by memcpy-ing them to
    net_device->dev_addr, without actually setting them via
    dev_set_mac_address(). This permits us to always receive all the traffic
    on one mac address.

    However, in case the interface flips, some drivers might enforce
    mac filtering for their FW/HW based on the current ->dev_addr, and thus we
    won't be able to receive traffic on that interface if it is selected
    as active in TLB mode.

    Fix it by setting the mac address forcefully on every new active slave that
    we select in TLB mode.

    CC: Jay Vosburgh
    CC: Andy Gospodarek
    CC: Yuval Mintz
    Reported-by: Yuval Mintz
    Tested-by: Yuval Mintz
    Signed-off-by: Veaceslav Falico
    Signed-off-by: David S. Miller

    Veaceslav Falico
     
  • This patch fixes RX packet errors when receiving large amounts of data,
    by setting bit RNC = 1.

    RNC - Receive Enable Control

    0: Upon completion of reception of one frame, the E-DMAC writes
    the receive status to the descriptor and clears the RR bit in
    EDRRR to 0.

    1: Upon completion of reception of one frame, the E-DMAC writes
    (writes back) the receive status to the descriptor. In addition,
    the E-DMAC reads the next descriptor and prepares for reception
    of the next frame.

    In addition, to make packet reception more stable, I set the maximum
    size for the transmit/receive FIFOs and insert padding into the
    receive data.

    Signed-off-by: Nguyen Hong Ky
    Signed-off-by: David S. Miller

    Nguyen Hong Ky
     
  • net/l2tp/l2tp_core.c: In function ‘l2tp_verify_udp_checksum’:
    net/l2tp/l2tp_core.c:499:22: warning: unused variable ‘tunnel’ [-Wunused-variable]

    Create a helper "l2tp_tunnel()" to facilitate this, and as a side
    effect get rid of a bunch of unnecessary void pointer casts.
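
    The helper is essentially a typed accessor (a sketch of the idea):

        static inline struct l2tp_tunnel *l2tp_tunnel(struct sock *sk)
        {
                return sk->sk_user_data; /* tunnel stored at setup time */
        }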

    Signed-off-by: David S. Miller

    David S. Miller
     
  • We play with a wait queue even if the socket is
    non-blocking. This is an obvious waste.
    Besides, it will prevent calling the non-blocking
    variant when current is not valid.

    Signed-off-by: Michael S. Tsirkin
    Acked-by: Jason Wang
    Signed-off-by: David S. Miller

    Michael S. Tsirkin
     
  • Alan Ott says:

    ====================
    Fix race conditions in mrf24j40 interrupts

    After testing with the betas of this patchset, it's been rebased and is
    ready for inclusion.

    David Hauweele noticed that the mrf24j40 would hang arbitrarily after some
    period of heavy traffic. Two race conditions were discovered, and the
    driver was changed to use threaded interrupts, since the enable/disable of
    interrupts in the driver has recently been a lightning rod whenever issues
    arise related to interrupts (costing engineering time), and since threaded
    interrupts are the right way to do it.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • The mrf24j40 generates level interrupts. There are rare cases where it
    appears that the interrupt line never gets de-asserted between interrupts,
    causing interrupts to be lost, and causing a hung device from the driver's
    perspective. Switching the driver to interpret these interrupts as
    level-triggered fixes this issue.

    Signed-off-by: Alan Ott
    Signed-off-by: David S. Miller

    Alan Ott
     
  • Eliminate all the workqueue and interrupt enable/disable logic.
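
    Registration would then look something like this (a sketch; the handler
    name and flags are assumptions):

        /* NULL hard handler + IRQF_ONESHOT keeps the line masked until the
         * thread, which may sleep on SPI transfers, has finished */
        ret = request_threaded_irq(spi->irq, NULL, mrf24j40_isr,
                                   IRQF_TRIGGER_LOW | IRQF_ONESHOT,
                                   dev_name(&spi->dev), devrec);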

    Signed-off-by: Alan Ott
    Signed-off-by: David S. Miller

    Alan Ott
     
  • This avoids a race condition where complete(tx_complete) could be called
    before tx_complete is initialized.
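
    In other words (sketch; the field name follows the message):

        init_completion(&devrec->tx_complete); /* initialize first... */
        /* ...then start the TX whose interrupt handler will call
         * complete(&devrec->tx_complete) */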

    Signed-off-by: Alan Ott
    Signed-off-by: David S. Miller

    Alan Ott