11 Sep, 2012

1 commit

  • It is a frequent mistake to confuse the netlink port identifier with a
    process identifier. Try to reduce this confusion by renaming the fields
    that hold port identifiers to portid instead of pid.

    I have carefully avoided changing the structures exported to
    userspace to avoid changing the userspace API.

    I have successfully built an allyesconfig kernel with this change.
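
    To illustrate the distinction (a minimal userspace sketch, not part of the
    patch): the nl_pid field of struct sockaddr_nl is a netlink port id.
    Binding with nl_pid = 0 asks the kernel to assign a unique port id, which
    only happens to equal the process id for a process's first netlink socket.

      #include <linux/netlink.h>
      #include <string.h>
      #include <sys/socket.h>
      #include <unistd.h>

      int open_netlink(void)
      {
          struct sockaddr_nl addr;
          int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

          if (fd < 0)
              return -1;

          memset(&addr, 0, sizeof(addr));
          addr.nl_family = AF_NETLINK;
          addr.nl_pid = 0;  /* a netlink port id, not a process id */

          if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
              close(fd);
              return -1;
          }
          return fd;
      }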

    Signed-off-by: "Eric W. Biederman"
    Acked-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

01 Sep, 2012

1 commit


25 Aug, 2012

1 commit


24 Aug, 2012

1 commit


23 Aug, 2012

3 commits

  • Change since v1:

    * Fixed inuse counters access spotted by Eric

    In patch eea68e2f (packet: Report socket mclist info via diag module) I've
    introduced a "scheduling in atomic" problem in the packet diag module -- the
    socket list is traversed under rcu_read_lock(), while the sk mclist access
    performed under it requires the rtnl lock (i.e. a mutex) to be taken.

    [152363.820563] BUG: scheduling while atomic: crtools/12517/0x10000002
    [152363.820573] 4 locks held by crtools/12517:
    [152363.820581] #0: (sock_diag_mutex){+.+.+.}, at: [] sock_diag_rcv+0x1f/0x3e
    [152363.820613] #1: (sock_diag_table_mutex){+.+.+.}, at: [] sock_diag_rcv_msg+0xdb/0x11a
    [152363.820644] #2: (nlk->cb_mutex){+.+.+.}, at: [] netlink_dump+0x23/0x1ab
    [152363.820693] #3: (rcu_read_lock){.+.+..}, at: [] packet_diag_dump+0x0/0x1af

    A similar problem was then re-introduced by further packet diag patches (the
    fanout mutex and the pgvec mutex for rings) :(

    Apart from being terribly sorry for the above, I propose to change the packet
    sk list protection from a spinlock to a mutex. This lock currently protects two
    kinds of modifications:

    * sklist
    * prot inuse counters

    The sklist modifications can simply be re-protected with a mutex, since they
    already occur in a sleeping context. The inuse counters modifications are
    trickier -- the __this_cpu_* ops are used inside, thus requiring the caller to
    handle the potential context issues himself. Since packet sockets' counters are
    modified in only two places (packet_create and packet_release), we only need to
    protect the context from being preempted. BH disabling is not required in this
    case.
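
    A minimal sketch of the resulting pattern, as in packet_create (names
    follow af_packet.c; illustrative rather than the literal diff):

      mutex_lock(&net->packet.sklist_lock);       /* sleeping context is fine */
      sk_add_node_rcu(sk, &net->packet.sklist);
      mutex_unlock(&net->packet.sklist_lock);

      preempt_disable();                          /* stable CPU for __this_cpu_* */
      sock_prot_inuse_add(net, &packet_proto, 1);
      preempt_enable();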

    Signed-off-by: Pavel Emelyanov
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Instead of using a hard-coded value for the status variable, it makes the
    code more readable to use the designated define from linux/if_packet.h.
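
    For example (an assumed call site; TP_STATUS_KERNEL is defined as 0 in
    linux/if_packet.h, so behavior is unchanged):

      __packet_set_status(po, ph, TP_STATUS_KERNEL);  /* was: ..., 0 */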

    Signed-off-by: daniel.borkmann@tik.ee.ethz.ch
    Signed-off-by: David S. Miller

    danborkmann@iogearbox.net
     
  • David S. Miller
     

20 Aug, 2012

3 commits

  • If a packet is emitted on one socket in a group of fanout sockets,
    it is transmitted again. It is thus read again on one of the sockets
    of the fanout group. This results in a loop for software that
    generates packets upon receiving one.
    This retransmission is not the intended behavior: a fanout group
    must behave like a single socket. The packet should not be
    transmitted on a socket if it originates from a socket belonging
    to the same fanout group.

    This patch fixes the issue by changing the transmission check to
    take fanout group membership into account.
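
    A sketch of the membership check (this follows the af_packet code closely,
    but treat it as illustrative): when a transmitted packet is looped back to
    the taps, delivery is skipped if the originating socket belongs to the
    receiving hook's fanout group.

      static bool match_fanout_group(struct packet_type *ptype, struct sock *sk)
      {
          if (ptype->af_packet_priv == (void *)((struct packet_sock *)sk)->fanout)
              return true;

          return false;
      }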

    Reported-by: Aleksandr Kotov
    Signed-off-by: Eric Leblond
    Signed-off-by: David S. Miller

    Eric Leblond
     
  • The reported value is the same as the one returned by the PACKET_FANOUT
    getsockopt, but unlike it, an absent fanout setup results in an absent
    nlattr, rather than in an nlattr with a zero value. This is done because a
    zero fanout report could mean either no fanout, or a fanout with both id
    and type zero.
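
    A sketch of the dump-side logic under that rule (attribute name taken
    from the packet diag patches; treat as illustrative): the nlattr is only
    emitted when a fanout is actually configured.

      if (po->fanout) {
          u32 val = po->fanout->id | ((u32)po->fanout->type << 16);

          ret = nla_put_u32(nlskb, PACKET_DIAG_FANOUT, val);
      }
      /* else: no attribute at all, instead of an ambiguous zero */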

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • One extension bit may result in two nlattrs -- one per ring type.
    If some ring type is not configured, then the respective nlattr
    will be empty.

    The structure reported contains the data that is given to the
    corresponding ring-setup socket option.
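
    For reference, the reported structure mirrors the ring-setup parameters
    (field list as assumed from the packet_diag uapi header):

      struct packet_diag_ring {
          __u32 pdr_block_size;
          __u32 pdr_block_nr;
          __u32 pdr_frame_size;
          __u32 pdr_frame_nr;
          __u32 pdr_retire_tmo;
          __u32 pdr_sizeof_priv;
          __u32 pdr_features;
      };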

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     

15 Aug, 2012

5 commits


13 Aug, 2012

1 commit

  • Here's a quote of the comment about the BUG macro from asm-generic/bug.h:

    Don't use BUG() or BUG_ON() unless there's really no way out; one
    example might be detecting data structure corruption in the middle
    of an operation that can't be backed out of. If the (sub)system
    can somehow continue operating, perhaps with reduced functionality,
    it's probably not BUG-worthy.

    If you're tempted to BUG(), think again: is completely giving up
    really the *only* solution? There are usually better options, where
    users don't need to reboot ASAP and can mostly shut down cleanly.

    In our case, the status flag of a ring buffer slot is managed from both
    sides, kernel space and user space. This means that even though the kernel
    side might work as expected, user space can screw up and change this flag
    in the window between send(2) setting it to TP_STATUS_SENDING and the given
    skb being destructed some time later; this will then hit the BUG macro. As
    David suggested, the best solution is to simply remove this statement, since
    it cannot be used for kernel-side internal consistency checks. I've tested
    it and the system still behaves /stable/ in this case, so in accordance with
    the above comment, we should rather remove it.
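
    An illustration of the change (assumed shape of the call site in
    tpacket_destruct_skb): the assertion goes away and the slot is simply
    handed back.

      /* removed -- user space may legally rewrite the flag at any time:
       * BUG_ON(__packet_get_status(po, ph) != TP_STATUS_SENDING);
       */
      __packet_set_status(po, ph, TP_STATUS_AVAILABLE);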

    Signed-off-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    danborkmann@iogearbox.net
     

09 Aug, 2012

1 commit


28 Jun, 2012

1 commit

  • 1. Removed code replication for tov calculation for 1G, 10G and
    made it common for speeds > 1G (1G, 10G, 40G, 100G).
    2. Defined values for 4 different 40G PHYs (KR4, LR4, SR4, CR4).

    Signed-off-by: Parav Pandit
    Reviewed-by: Ben Hutchings
    Signed-off-by: David S. Miller

    parav.pandit@emulex.com
     

12 Jun, 2012

1 commit


04 Jun, 2012

1 commit

  • Adding casts of objects to the same type is unnecessary
    and confusing for a human reader.

    For example, this cast:

    int y;
    int *p = (int *)&y;

    I used the coccinelle script below to find and remove these
    unnecessary casts. I manually removed the conversions this
    script produces of casts with __force and __user.

    @@
    type T;
    T *p;
    @@

    - (T *)p
    + p

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     

22 Apr, 2012

1 commit


20 Apr, 2012

1 commit


16 Apr, 2012

1 commit


29 Mar, 2012

1 commit


24 Feb, 2012

1 commit


31 Dec, 2011

1 commit


28 Dec, 2011

1 commit


24 Dec, 2011

1 commit


23 Dec, 2011

1 commit

  • skb->truesize might be big even for a small packet.

    It's even bigger after commit 87fb4b7b533 (net: more accurate skb
    truesize) and with a big MTU.

    We should allow queueing at least one packet per receiver, even with a
    low RCVBUF setting.
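
    A sketch of the relaxed check (assumed shape, not the literal diff): the
    skb is accepted unless the receive queue has already reached sk_rcvbuf,
    so at least one packet can be queued regardless of its truesize.

      if (atomic_read(&sk->sk_rmem_alloc) >= (unsigned)sk->sk_rcvbuf)
          goto drop_n_acct;  /* queue already at or over the limit */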

    Reported-by: Michal Simek
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

19 Nov, 2011

2 commits

  • packet: Add needed_tailroom to packet_sendmsg_spkt

    While auditing LL_ALLOCATED_SPACE I noticed that packet_sendmsg_spkt
    did not include needed_tailroom when allocating an skb. This isn't
    a fatal error, as we should always tolerate inadequate tail room, but
    it isn't optimal.

    This patch fixes that.
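
    A sketch of the corrected allocation (assumed shape): the device's
    needed_tailroom is simply added to the requested size.

      skb = sock_wmalloc(sk, len + LL_RESERVED_SPACE(dev) +
                         dev->needed_tailroom, 0, GFP_KERNEL);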

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • net: Remove all uses of LL_ALLOCATED_SPACE

    The macro LL_ALLOCATED_SPACE was ill-conceived. It applies the
    alignment to the sum of needed_headroom and needed_tailroom. As
    the amount that is then reserved for head room is needed_headroom
    with alignment, this means that the tail room left may be too small.

    This patch replaces all uses of LL_ALLOCATED_SPACE with the macro
    LL_RESERVED_SPACE and direct reference to needed_tailroom.

    This also fixes the problem with needed_headroom changing between
    allocating the skb and reserving the head room.
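
    The replacement pattern looks like this (illustrative): headroom is
    reserved with alignment, and tailroom is accounted for separately.

      skb = alloc_skb(LL_RESERVED_SPACE(dev) + len + dev->needed_tailroom,
                      GFP_ATOMIC);
      if (skb)
          skb_reserve(skb, LL_RESERVED_SPACE(dev));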

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

04 Nov, 2011

1 commit

  • This popped some compiler warnings due to mismatched prototypes. Just
    remove most manual inlines; the compiler should be able to figure out
    what makes sense to inline and what doesn't.

    net/packet/af_packet.c:252: warning: 'prb_curr_blk_in_use' declared inline after being called
    net/packet/af_packet.c:252: warning: previous declaration of 'prb_curr_blk_in_use' was here
    net/packet/af_packet.c:258: warning: 'prb_queue_frozen' declared inline after being called
    net/packet/af_packet.c:258: warning: previous declaration of 'prb_queue_frozen' was here
    net/packet/af_packet.c:248: warning: 'packet_previous_frame' declared inline after being called
    net/packet/af_packet.c:248: warning: previous declaration of 'packet_previous_frame' was here
    net/packet/af_packet.c:251: warning: 'packet_increment_head' declared inline after being called
    net/packet/af_packet.c:251: warning: previous declaration of 'packet_increment_head' was here
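
    A standalone repro of this warning class (hypothetical code, not from
    af_packet.c): the function is forward-declared without inline, called,
    and only later defined inline.

      static int frozen(int x);           /* no inline here ... */

      static int use(int x)
      {
          return frozen(x);               /* ... but called here ... */
      }

      static inline int frozen(int x)     /* ... and declared inline here */
      {
          return x != 0;
      }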

    Signed-off-by: Olof Johansson
    Cc: Chetan Loke
    Signed-off-by: David S. Miller

    Olof Johansson
     

19 Oct, 2011

1 commit

  • Fragmented multicast frames are delivered to a single macvlan port,
    because the ip defrag logic considers the other copies redundant.

    Implement a defrag step before trying to send the multicast frame.
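
    A sketch of the hook (ip_check_defrag() was introduced for this purpose;
    shape assumed from the era's rx-handler code):

      skb = ip_check_defrag(skb, IP_DEFRAG_MACVLAN);
      if (!skb)
          return RX_HANDLER_CONSUMED;  /* fragment queued; datagram not complete yet */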

    Reported-by: Ben Greear
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

11 Oct, 2011

1 commit


08 Oct, 2011

1 commit


04 Oct, 2011

1 commit

  • This is a minor change.

    Up until kernel 2.6.32, getsockopt(fd, SOL_PACKET, PACKET_STATISTICS,
    ...) would return total and dropped packets since its last invocation. The
    introduction of socket queue overflow reporting [1] changed drop
    rate calculation in the normal packet socket path, but not when using a
    packet ring. As a result, the getsockopt now returns different statistics
    depending on the reception method used. With a ring, it still returns the
    count since the last call, as counts are incremented in tpacket_rcv and
    reset in getsockopt. Without a ring, it returns 0 if no drops occurred
    since the last getsockopt and the total drops over the lifespan of
    the socket otherwise. The culprit is this line in packet_rcv, executed
    on a drop:

    drop_n_acct:
    po->stats.tp_drops = atomic_inc_return(&sk->sk_drops);

    As it shows, the new drop number is taken from the socket drop counter,
    which is not reset at getsockopt. I put together a small example
    that demonstrates the issue [2]. It runs for 10 seconds and overflows
    the queue/ring on every odd second. The reported drop rates are:
    ring: 16, 0, 16, 0, 16, ...
    non-ring: 0, 15, 0, 30, 0, 46, 0, 60, 0 , 74.

    Note how the even-numbered non-ring counts monotonically increase. Because
    the getsockopt adds tp_drops to tp_packets, total counts are similarly
    reported cumulatively. Long story short, reinstating the original code, as
    the below patch does, fixes the issue at the cost of additional per-packet
    cycles. Another solution that does not introduce per-packet overhead
    would be to keep the current data path, record the value of sk_drops at
    getsockopt() call N in a new field in struct packet_sock, and subtract
    that when reporting at call N+1. I'll be happy to code that instead;
    it's just messier.
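
    A sketch of the reinstated accounting (assumed shape of the fix):

      drop_n_acct:
          spin_lock(&sk->sk_receive_queue.lock);
          po->stats.tp_drops++;
          atomic_inc(&sk->sk_drops);
          spin_unlock(&sk->sk_receive_queue.lock);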

    [1] http://patchwork.ozlabs.org/patch/35665/
    [2] http://kernel.googlecode.com/files/test-packetsock-getstatistics.c

    Signed-off-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Willem de Bruijn
     

16 Sep, 2011

1 commit

  • This patch does several things:
    - introduces __ethtool_get_settings, which is called from ethtool code and
      from drivers as well. Puts ASSERT_RTNL there.
    - dev_ethtool_get_settings() is replaced by __ethtool_get_settings()
    - changes calls in drivers so rtnl locking is respected. In
      iboe_get_rate, ->get_settings() was previously called unlocked. This
      fixes it. prb_calc_retire_blk_tmo() in af_packet.c had the same
      problem; it is also fixed, by calling __dev_get_by_index() instead of
      dev_get_by_index() and holding rtnl_lock around both calls (see the
      sketch below).
    - introduces rtnl_lock in bnx2fc_vport_create() and fcoe_vport_create()
      so bnx2fc_if_create() and fcoe_if_create() are called locked, as they
      are from other places.
    - uses __ethtool_get_settings() in bonding code
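
    A sketch of the locked pattern for the af_packet case (illustrative; ecmd
    is a struct ethtool_cmd):

      rtnl_lock();
      dev = __dev_get_by_index(sock_net(sk), po->ifindex);
      if (dev)
          err = __ethtool_get_settings(dev, &ecmd);
      rtnl_unlock();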

    Signed-off-by: Jiri Pirko

    v2->v3:
    - removed dev_ethtool_get_settings()
    - added ASSERT_RTNL into __ethtool_get_settings()
    - prb_calc_retire_blk_tmo: use __dev_get_by_index() and lock
      around it and the __ethtool_get_settings() call
    v1->v2:
    - added missing EXPORT_SYMBOL
    Reviewed-by: Ben Hutchings [except FCoE bits]
    Acked-by: Ralf Baechle
    Signed-off-by: David S. Miller

    Jiri Pirko
     

27 Aug, 2011

1 commit


25 Aug, 2011

1 commit

  • 1) Blocks can be configured with a non-static frame size.
    2) Read/poll is at a block level (as opposed to packet level).
    3) Added a poll timeout to avoid indefinite user-space waits on idle links.
    4) Added user-configurable knobs:
    4.1) block::timeout
    4.2) tpkt_hdr::sk_rxhash

    Changes:
    C1) tpacket_rcv()
    C1.1) packet_current_frame() is replaced by packet_current_rx_frame().
    The bulk of the processing is then moved into the following chain:

      packet_current_rx_frame()
        __packet_lookup_frame_in_block()
          fill_curr_block()
          or
          retire_current_block()
            dispatch_next_block()
          or
          return NULL (queue is plugged/paused)
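
    For context, a minimal userspace sketch of configuring such a ring (sizes
    are arbitrary examples; fd is assumed to be an AF_PACKET socket):

      int ver = TPACKET_V3;
      struct tpacket_req3 req = {
          .tp_block_size       = 1 << 22,  /* 4 MiB per block */
          .tp_block_nr         = 64,
          .tp_frame_size       = 1 << 11,
          .tp_frame_nr         = ((1 << 22) / (1 << 11)) * 64,
          .tp_retire_blk_tov   = 60,       /* knob 4.1: block timeout, msec */
          .tp_feature_req_word = TP_FT_REQ_FILL_RXHASH,  /* knob 4.2 */
      };

      setsockopt(fd, SOL_PACKET, PACKET_VERSION, &ver, sizeof(ver));
      setsockopt(fd, SOL_PACKET, PACKET_RX_RING, &req, sizeof(req));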

    Signed-off-by: Chetan Loke
    Signed-off-by: David S. Miller

    chetan loke
     

14 Jul, 2011

1 commit

  • Currently we flush tp_status and then flush the remainder of the header +
    payload. tp_status should be flushed at the end to avoid stale data being
    read by user space.

    Incorrectly re-ordered barriers in v1.
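
    The intended ordering, sketched (illustrative, not the literal diff):
    publish the header + payload first, then the status word, so user space
    polling tp_status never sees the slot before its data is visible.

      /* 1. fill and flush header + payload */
      smp_wmb();                                    /* order payload before status */
      __packet_set_status(po, ph, TP_STATUS_USER);  /* 2. status flushed last */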

    Signed-off-by: Chetan Loke
    Signed-off-by: David S. Miller

    Chetan Loke