Eric Lee / smarc-fsl-linux-kernel

12 Jul, 2013

1 commit

d77e41e12 net/tipc: use %*phC to dump small buffers in hex form ... Browse Code »

Instead of passing each byte by stack let's use nice specifier for that.

Signed-off-by: Andy Shevchenko
Signed-off-by: David S. Miller

Andy Shevchenko
2013-07-12 08:03:36 +0800

18 Jun, 2013

15 commits

2537af9dc tipc: remove dev_base_lock use from enable_bearer ... Browse Code »

Convert enable_bearer() to RCU locking with dev_get_by_name().

Based on a similar changeset in commit 840a185d ["aoe: remove
dev_base_lock use from aoecmd_cfg_pkts()"] -- quoting that:

"dev_base_lock is the legacy way to lock the device list,
and is planned to disappear. (writers hold RTNL, readers
hold RCU lock)"

Signed-off-by: Ying Xue
Signed-off-by: Paul Gortmaker
Signed-off-by: David S. Miller

Ying Xue
2013-06-18 06:53:01 +0800
126c05246 tipc: fix wrong return value for link_send_sections_long routine ... Browse Code »

When skb buffer cannot be allocated in link_send_sections_long(),
-ENOMEM error code instead of -EFAULT should be returned to its
caller.

Signed-off-by: Ying Xue
Signed-off-by: Paul Gortmaker
Signed-off-by: David S. Miller

Ying Xue
2013-06-18 06:53:01 +0800
7410f967b tipc: make tipc_link_send_sections_fast exit earlier ... Browse Code »

Once message build request function returns invalid code, the
process of sending message cannot continue. So in case of message
build failure, tipc_link_send_sections_fast() should return
immediately.

Signed-off-by: Ying Xue
Signed-off-by: Paul Gortmaker
Signed-off-by: David S. Miller

Ying Xue
2013-06-18 06:53:01 +0800
796c75d0d tipc: enhance priority of link protocol packet ... Browse Code »

pfifo_fast is set as default traffic class queueing discipline. This
queue has three so called "bands". Within each band, FIFO rules apply.
However, as long as there are packets waiting in band 0, band 1 won't
be processed.

Now all kind of TIPC type packet priorities are never set, that is,
their priorities are 0, so they are mapped to band 1 of pfifo_fast
qdisc. But, especially during link congestion, if link protocol packet
can be sent out as earlier as possible than other type of packets so
that protocol packet can arrive at peer endpoint in time, the peer
will timely reset its link timeout timer to keep the link alive.
So enhancing the priority of link protocol packets can meet the
specific demand to avoid unnecessary link reset due to a transient
link congestion.

Signed-off-by: Ying Xue
Signed-off-by: Paul Gortmaker
Signed-off-by: David S. Miller

Ying Xue
2013-06-18 06:53:01 +0800
ae8509c42 tipc: cosmetic realignment of function arguments ... Browse Code »

No runtime code changes here. Just a realign of the function
arguments to start where the 1st one was, and fit as many args
as can be put in an 80 char line.

Signed-off-by: Paul Gortmaker
Signed-off-by: David S. Miller

Paul Gortmaker
2013-06-18 06:53:01 +0800
c0fee8aca tipc: save sock structure pointer instead of void pointer to tipc_port ... Browse Code »

Directly save sock structure pointer instead of void pointer to avoid
unnecessary cast conversions.

Signed-off-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: Paul Gortmaker
Signed-off-by: David S. Miller

Ying Xue
2013-06-18 06:53:01 +0800
28e529728 tipc: convert config_lock from spinlock to mutex ... Browse Code »

As the configuration server is now running under process context,
it's unnecessary for us to have a spinlock serializing the TIPC
configuration process. Instead, we replace it with a mutex lock,
which gives us more freedom. For instance, we can now call
pre-emptable functions within the protected area.

Signed-off-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: Paul Gortmaker
Signed-off-by: David S. Miller

Ying Xue
2013-06-18 06:53:01 +0800
3c5db8e4e tipc: rename tipc_createport_raw to tipc_createport ... Browse Code »

After the removal of the native API, there is now only one way to
to create a TIPC port instance -- the function tipc_createport_raw().
We make it more readable by renaming it to tipc_createport().

Signed-off-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: Paul Gortmaker
Signed-off-by: David S. Miller

Ying Xue
2013-06-18 06:53:01 +0800
f1733d758 tipc: remove user_port instance from tipc_port structure ... Browse Code »

After the native API has been completely removed, the 'user_port'
field in struct tipc_port becomes unused, and can be removed.
As a consequence, the "usrmem" argument in tipc_msg_build() is no
longer needed, and so we remove that one too.

Signed-off-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: Paul Gortmaker
Signed-off-by: David S. Miller

Ying Xue
2013-06-18 06:53:00 +0800
198d73b82 tipc: delete code orphaned by new server infrastructure ... Browse Code »

Having completed the conversion of the topology server and
configuration server to use the new server infrastructure,
the following functions become unused, and can be deleted:

- tipc_createport()
- port_wakeup_sh()
- port_dispatcher()
- port_dispatcher_sigh()
- tipc_send_buf_fast()
- tipc_send_buf2port

Additionally, the following variables become orphaned,
and can be deleted:

- tipc_msg_err_event
- tipc_named_msg_err_event
- tipc_conn_shutdown_event
- tipc_msg_event
- tipc_named_msg_event
- tipc_conn_msg_event
- tipc_continue_event
- msg_queue_head
- msg_queue_tail
- queue_lock

Deletion is done here in a separate commit in order to allow
the actual conversion changes to be more easily viewed.

Signed-off-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: Paul Gortmaker
Signed-off-by: David S. Miller

Ying Xue
2013-06-18 06:53:00 +0800
7d0ab17b7 tipc: convert configuration server to use new server facility ... Browse Code »

As the new socket-based TIPC server infrastructure has been
introduced, we can now convert the configuration server to use
it. Then we can take future steps to simplify the configuration
server locking policy.

Some minor reordering of initialization is done, due to the
dependency on having tipc_socket_init completed.

Signed-off-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: Paul Gortmaker
Signed-off-by: David S. Miller

Ying Xue
2013-06-18 06:53:00 +0800
13a2e8987 tipc: convert topology server to use new server facility ... Browse Code »

As the new TIPC server infrastructure has been introduced, we can
now convert the TIPC topology server to it. We get two benefits
from doing this:

1) It simplifies the topology server locking policy. In the
original locking policy, we placed one spin lock pointer in the
tipc_subscriber structure to reuse the lock of the subscriber's
server port, controlling access to members of tipc_subscriber
instance. That is, we only used one lock to ensure both
tipc_port and tipc_subscriber members were safely accessed.

Now we introduce another spin lock for tipc_subscriber structure
only protecting themselves, to get a finer granularity locking
policy. Moreover, the change will allow us to make the topology
server code more readable and maintainable.

2) It fixes a bug where sent subscription events may be lost when
the topology port is congested. Using the new service, the
topology server now queues sent events into an outgoing buffer,
and then wakes up a sender process which has been blocked in
workqueue context. The process will keep picking events from the
buffer and send them to their respective subscribers, using the
kernel socket interface, until the buffer is empty. Even if the
socket is congested during transmission there is no risk that
events may be dropped, since the sender process may block when
needed.

Some minor reordering of initialization is done, since we now
have a scenario where the topology server must be started after
socket initialization has taken place, as the former depends
on the latter. And overall, we see a simplification of the
TIPC subscriber code in making this changeover.

Signed-off-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: Paul Gortmaker
Signed-off-by: David S. Miller

Ying Xue
2013-06-18 06:53:00 +0800
c5fa7b3cf tipc: introduce new TIPC server infrastructure ... Browse Code »

TIPC has two internal servers, one providing a subscription
service for topology events, and another providing the
configuration interface. These servers have previously been running
in BH context, accessing the TIPC-port (aka native) API directly.
Apart from these servers, even the TIPC socket implementation is
partially built on this API.

As this API may simultaneously be called via different paths and in
different contexts, a complex and costly lock policiy is required
in order to protect TIPC internal resources.

To eliminate the need for this complex lock policiy, we introduce
a new, generic service API that uses kernel sockets for message
passing instead of the native API. Once the toplogy and configuration
servers are converted to use this new service, all code pertaining
to the native API can be removed. This entails a significant
reduction in code amount and complexity, and opens up for a complete
rework of the locking policy in TIPC.

The new service also solves another problem:

As the current topology server works in BH context, it cannot easily
be blocked when sending of events fails due to congestion. In such
cases events may have to be silently dropped, something that is
unacceptable. Therefore, the new service keeps a dedicated outbound
queue receiving messages from BH context. Once messages are
inserted into this queue, we will immediately schedule a work from a
special workqueue. This way, messages/events from the topology server
are in reality sent in process context, and the server can block
if necessary.

Analogously, there is a new workqueue for receiving messages. Once a
notification about an arriving message is received in BH context, we
schedule a work from the receive workqueue to do the job of
receiving the message in process context.

As both sending and receive messages are now finished in processes,
subscribed events cannot be dropped any more.

As of this commit, this new server infrastructure is built, but
not actually yet called by the existing TIPC code, but since the
conversion changes required in order to use it are significant,
the addition is kept here as a separate commit.

Signed-off-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: Paul Gortmaker
Signed-off-by: David S. Miller

Ying Xue
2013-06-18 06:53:00 +0800
5d21cb70d tipc: allow implicit connect for stream sockets ... Browse Code »

TIPC's implied connect feature, aka piggyback connect, allows
applications to save one syscall and all SYN/SYN-ACK signalling
overhead when setting up a connection. Until now, this has only
been supported for SEQPACKET sockets. Here, we make it possible
to use this feature even with stream sockets.

At the connecting side, the connection is completed when the
first data message arrives from the accepting peer. This means
that we must allow the connecting user to call blocking recv()
before the socket has reached state SS_CONNECTED. So we must must
relax the state machine check at recv_stream(), and allow the
recv() call even if socket is in state SS_CONNECTING.

Signed-off-by: Erik Hugne
Signed-off-by: Jon Maloy
Signed-off-by: Paul Gortmaker
Signed-off-by: David S. Miller

Erik Hugne
2013-06-18 06:53:00 +0800
cc79dd1ba tipc: change socket buffer overflow control to respect sk_rcvbuf ... Browse Code »

As per feedback from the netdev community, we change the buffer
overflow protection algorithm in receiving sockets so that it
always respects the nominal upper limit set in sk_rcvbuf.

Instead of scaling up from a small sk_rcvbuf value, which leads to
violation of the configured sk_rcvbuf limit, we now calculate the
weighted per-message limit by scaling down from a much bigger value,
still in the same field, according to the importance priority of the
received message.

To allow for administrative tunability of the socket receive buffer
size, we create a tipc_rmem sysctl variable to allow the user to
configure an even bigger value via sysctl command. It is a size of
three (min/default/max) to be consistent with things like tcp_rmem.

By default, the value initialized in tipc_rmem[1] is equal to the
receive socket size needed by a TIPC_CRITICAL_IMPORTANCE message.
This value is also set as the default value of sk_rcvbuf.

Originally-by: Jon Maloy
Cc: Neil Horman
Cc: Jon Maloy
[Ying: added sysctl variation to Jon's original patch]
Signed-off-by: Ying Xue
[PG: don't compile sysctl.c if not config'd; add Documentation]
Signed-off-by: Paul Gortmaker
Signed-off-by: David S. Miller

Ying Xue
2013-06-18 06:53:00 +0800

29 May, 2013

1 commit

351638e7d net: pass info struct via netdevice notifier ... Browse Code »

So far, only net_device * could be passed along with netdevice notifier
event. This patch provides a possibility to pass custom structure
able to provide info that event listener needs to know.

Signed-off-by: Jiri Pirko

v2->v3: fix typo on simeth
shortened dev_getter
shortened notifier_info struct name
v1->v2: fix notifier_call parameter in call_netdevice_notifier()
Signed-off-by: David S. Miller

Jiri Pirko
2013-05-29 04:11:01 +0800

07 May, 2013

2 commits

6bf15191f tipc: potential divide by zero in tipc_link_recv_fragment() ... Browse Code »

The worry here is that fragm_sz could be zero since it comes from
skb->data.

Signed-off-by: Dan Carpenter
Signed-off-by: David S. Miller

Dan Carpenter
2013-05-07 04:16:52 +0800
cb4b102f0 tipc: add a bounds check in link_recv_changeover_msg() ... Browse Code »

The bearer_id here comes from skb->data and it can be a number from 0 to
7. The problem is that the ->links[] array has only 2 elements so I
have added a range check.

Signed-off-by: Dan Carpenter
Signed-off-by: David S. Miller

Dan Carpenter
2013-05-07 04:16:52 +0800

04 May, 2013

3 commits

488fc9af8 tipc: pskb_copy() buffers when sending on more than one bearer ... Browse Code »

When sending packets, TIPC bearers use skb_clone() before writing their
hardware header. This will however NOT copy the data buffer.
So when the same packet is sent over multiple bearers (to reach multiple
nodes), the same socket buffer data will be treated by multiple
tipc_media drivers which will write their own hardware header through
dev_hard_header().
Most of the time this is not a problem, because by the time the
packet is processed by the second media, it has already been sent over
the first one. However, when the first transmission is delayed (e.g.
because of insufficient bandwidth or through a shaper), the next bearer
will overwrite the hardware header, resulting in the packet being sent:
a) with the wrong source address, when bearers of the same type,
e.g. ethernet, are involved
b) with a completely corrupt header, or even dropped, when bearers of
different types are involved.

So when the same socket buffer is to be sent multiple times, send a
pskb_copy() instead (from the second instance on), and release it
afterwards (the bearer will skb_clone() it anyway).

Signed-off-by: Gerlando Falauto
Signed-off-by: David S. Miller

Gerlando Falauto
2013-05-04 04:08:58 +0800
77861d9c0 tipc: tipc_bcbearer_send(): simplify bearer selection ... Browse Code »

Signed-off-by: Gerlando Falauto
Signed-off-by: David S. Miller

Gerlando Falauto
2013-05-04 04:08:58 +0800
e61607109 tipc: cosmetic: clean up comments and break a long line ... Browse Code »

Signed-off-by: Gerlando Falauto
Signed-off-by: David S. Miller

Gerlando Falauto
2013-05-04 04:08:58 +0800

18 Apr, 2013

4 commits

a29a194a1 tipc: add InfiniBand media type ... Browse Code »

Add InfiniBand media type based on the ethernet media type.

The only real difference is that in case of InfiniBand, we need the entire
20 bytes of space reserved for media addresses, so the TIPC media type ID is
not explicitly stored in the packet payload.

Sample output of tipc-config:

# tipc-config -v -addr -netid -nt=all -p -m -b -n -ls

node address:
current network id: 4711
Type Lower Upper Port Identity Publication Scope
0 167776257 167776257 1855512578 cluster
167776260 167776260 1216454658 zone
1 1 1 1216479236 node
Ports:
1216479235: bound to {1,1}
1216454657: bound to {0,167776260}
Media:
eth
ib
Bearers:
ib:ib0
Nodes known:
: up
Link
Window:20 packets
RX packets:0 fragments:0/0 bundles:0/0
TX packets:0 fragments:0/0 bundles:0/0
RX naks:0 defs:0 dups:0
TX naks:0 acks:0 dups:0
Congestion bearer:0 link:0 Send queue max:0 avg:0

Link
ACTIVE MTU:2044 Priority:10 Tolerance:1500 ms Window:50 packets
RX packets:80 fragments:0/0 bundles:0/0
TX packets:40 fragments:0/0 bundles:0/0
TX profile sample:22 packets average:54 octets
0-64:100% -256:0% -1024:0% -4096:0% -16384:0% -32768:0% -66000:0%
RX states:410 probes:213 naks:0 defs:0 dups:0
TX states:410 probes:197 naks:0 acks:0 dups:0
Congestion bearer:0 link:0 Send queue max:1 avg:0

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2013-04-18 02:18:33 +0800
76f5c6f35 tipc: set skb->protocol in eth_media packet transmission ... Browse Code »

The skb->protocol field is used by packet classifiers and for AF_PACKET
cooked format, TIPC needs to set it properly.

Fixes packet classification and ethertype of 0x0000 in cooked captures:

Out 20:c9:d0:43:12:d9 ethertype Unknown (0x0000), length 56:
0x0000: 5b50 0028 0000 30d4 0100 1000 0100 1001 [P.(..0.........
0x0010: 0000 03e8 0000 0001 20c9 d043 12d9 0000 ...........C....
0x0020: 0000 0000 0000 0000 ........

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2013-04-18 02:18:33 +0800
8aeb89f21 tipc: move bcast_addr from struct tipc_media to struct tipc_bearer ... Browse Code »

Some network protocols, like InfiniBand, don't have a fixed broadcast
address but one that depends on the configuration. Move the bcast_addr
to struct tipc_bearer and initialize it with the broadcast address of
the network device when the bearer is enabled.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2013-04-18 02:18:33 +0800
ccc4ba2ea tipc: remove unused str2addr media callback ... Browse Code »

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2013-04-18 02:18:33 +0800

08 Apr, 2013

2 commits

d978a6361 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Conflicts:
drivers/nfc/microread/mei.c
net/netfilter/nfnetlink_queue_core.c

Pull in 'net' to get Eric Biederman's AF_UNIX fix, upon which
some cleanups are going to go on-top.

Signed-off-by: David S. Miller

David S. Miller
2013-04-08 06:37:01 +0800
60085c3d0 tipc: fix info leaks via msg_name in recv_msg/recv_stream ... Browse Code »

The code in set_orig_addr() does not initialize all of the members of
struct sockaddr_tipc when filling the sockaddr info -- namely the union
is only partly filled. This will make recv_msg() and recv_stream() --
the only users of this function -- leak kernel stack memory as the
msg_name member is a local variable in net/socket.c.

Additionally to that both recv_msg() and recv_stream() fail to update
the msg_namelen member to 0 while otherwise returning with 0, i.e.
"success". This is the case for, e.g., non-blocking sockets. This will
lead to a 128 byte kernel stack leak in net/socket.c.

Fix the first issue by initializing the memory of the union with
memset(0). Fix the second one by setting msg_namelen to 0 early as it
will be updated later if we're going to fill the msg_name member.

Cc: Jon Maloy
Cc: Allan Stephens
Signed-off-by: Mathias Krause
Signed-off-by: David S. Miller

Mathias Krause
2013-04-08 04:28:02 +0800

29 Mar, 2013

1 commit

573ce260b net-next: replace obsolete NLMSG_* with type safe nlmsg_* ... Browse Code »

Signed-off-by: Hong Zhiguo
Signed-off-by: David S. Miller

Hong zhi guo
2013-03-29 02:25:25 +0800

28 Feb, 2013

1 commit

b67bfe0d4 hlist: drop the node parameter from iterators ... Browse Code »

I'm not sure why, but the hlist for each entry iterators were conceived

list_for_each_entry(pos, head, member)

The hlist ones were greedy and wanted an extra parameter:

hlist_for_each_entry(tpos, pos, head, member)

Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.

Besides the semantic patch, there was some manual work required:

- Fix up the actual hlist iterators in linux/list.h
- Fix up the declaration of other iterators based on the hlist ones.
- A very small amount of places were using the 'node' parameter, this
was modified to use 'obj->member' instead.
- Coccinelle didn't handle the hlist_for_each_entry_safe iterator
properly, so those had to be fixed up manually.

The semantic patch which is mostly the work of Peter Senna Tschudin is here:

@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

type T;
expression a,c,d,e;
identifier b;
statement S;
@@

-T b;

[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin
Acked-by: Paul E. McKenney
Signed-off-by: Sasha Levin
Cc: Wu Fengguang
Cc: Marcelo Tosatti
Cc: Gleb Natapov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sasha Levin
2013-02-28 11:10:24 +0800

22 Feb, 2013

1 commit

06991c28f Merge tag 'driver-core-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core ... Browse Code »

Pull driver core patches from Greg Kroah-Hartman:
"Here is the big driver core merge for 3.9-rc1

There are two major series here, both of which touch lots of drivers
all over the kernel, and will cause you some merge conflicts:

- add a new function called devm_ioremap_resource() to properly be
able to check return values.

- remove CONFIG_EXPERIMENTAL

Other than those patches, there's not much here, some minor fixes and
updates"

Fix up trivial conflicts

* tag 'driver-core-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (221 commits)
base: memory: fix soft/hard_offline_page permissions
drivercore: Fix ordering between deferred_probe and exiting initcalls
backlight: fix class_find_device() arguments
TTY: mark tty_get_device call with the proper const values
driver-core: constify data for class_find_device()
firmware: Ignore abort check when no user-helper is used
firmware: Reduce ifdef CONFIG_FW_LOADER_USER_HELPER
firmware: Make user-mode helper optional
firmware: Refactoring for splitting user-mode helper code
Driver core: treat unregistered bus_types as having no devices
watchdog: Convert to devm_ioremap_resource()
thermal: Convert to devm_ioremap_resource()
spi: Convert to devm_ioremap_resource()
power: Convert to devm_ioremap_resource()
mtd: Convert to devm_ioremap_resource()
mmc: Convert to devm_ioremap_resource()
mfd: Convert to devm_ioremap_resource()
media: Convert to devm_ioremap_resource()
iommu: Convert to devm_ioremap_resource()
drm: Convert to devm_ioremap_resource()
...

Linus Torvalds
2013-02-22 04:05:51 +0800

19 Feb, 2013

1 commit

6338a53a2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net into net ... Browse Code »

Pull in 'net' to take in the bug fixes that didn't make it into
3.8-final.

Also, deal with the semantic conflict of the change made to
net/ipv6/xfrm6_policy.c A missing rt6->n neighbour release
was added to 'net', but in 'net-next' we no longer cache the
neighbour entries in the ipv6 routes so that change is not
appropriate there.

Signed-off-by: David S. Miller

David S. Miller
2013-02-19 12:34:21 +0800

16 Feb, 2013

4 commits

97f8b87e9 tipc: remove redundant checking for the number of iovecs in a send request ... Browse Code »

As the number of iovecs in a send request is already limited within
UIO_MAXIOV(i.e. 1024) in __sys_sendmsg(), it's unnecessary to check it
again in TIPC stack.

Signed-off-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: Paul Gortmaker

Ying Xue
2013-02-16 06:03:32 +0800
aba79f332 tipc: byte-based overload control on socket receive queue ... Browse Code »

Change overload control to be purely byte-based, using
sk->sk_rmem_alloc as byte counter, and compare it to a calculated
upper limit for the socket receive queue.

For all connection messages, irrespective of message importance,
the overload limit is set to a constant value (i.e, 67MB). This
limit should normally never be reached because of the lower
limit used by the flow control algorithm, and is there only
as a last resort in case a faulty peer doesn't respect the send
window limit.

For datagram messages, message importance is taken into account
when calculating the overload limit. The calculation is based
on sk->sk_rcvbuf, and is hence configurable via the socket option
SO_RCVBUF.

Cc: Neil Horman
Signed-off-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: Paul Gortmaker

Ying Xue
2013-02-16 06:01:58 +0800
57467e562 tipc: eliminate duplicated discard_rx_queue routine ... Browse Code »

The tipc function discard_rx_queue() is just a duplicated
implementation of __skb_queue_purge(). Remove the former
and directly invoke __skb_queue_purge().

In doing so, the underscores convey to the code reader, more
information about the current locking state that is assumed.

Signed-off-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: Paul Gortmaker

Ying Xue
2013-02-16 05:10:13 +0800
c5c73dca5 tipc: fix missing spinlock init in broadcast code ... Browse Code »

After commit 3c294cb3 "tipc: remove the bearer congestion mechanism",
we try to grab the broadcast bearer lock when sending multicast
messages over the broadcast link. This will cause an oops because
the lock is never initialized. This is an old bug, but the lock
was never actually used before commit 3c294cb3, so that why it was
not visible until now. The oops will look something like:

BUG: spinlock bad magic on CPU#2, daemon/147
lock: bcast_bearer+0x48/0xffffffffffffd19a [tipc],
.magic: 00000000, .owner: /-1, .owner_cpu: 0
Pid: 147, comm: daemon Not tainted 3.8.0-rc3+ #206
Call Trace:
spin_dump+0x8a/0x8f
spin_bug+0x21/0x26
do_raw_spin_lock+0x114/0x150
_raw_spin_lock_bh+0x19/0x20
tipc_bearer_blocked+0x1f/0x40 [tipc]
tipc_link_send_buf+0x82/0x280 [tipc]
? __alloc_skb+0x9f/0x2b0
tipc_bclink_send_msg+0x77/0xa0 [tipc]
tipc_multicast+0x11b/0x1b0 [tipc]
send_msg+0x225/0x530 [tipc]
sock_sendmsg+0xca/0xe0

The above can be triggered by running the multicast demo program.

Signed-off-by: Erik Hugne
Signed-off-by: Paul Gortmaker
Signed-off-by: David S. Miller

Erik Hugne
2013-02-16 04:40:56 +0800

12 Jan, 2013

1 commit

f887cc48c net/tipc: remove depends on CONFIG_EXPERIMENTAL ... Browse Code »

The CONFIG_EXPERIMENTAL config item has not carried much meaning for a
while now and is almost always enabled by default. As agreed during the
Linux kernel summit, remove it from any "depends on" lines in Kconfigs.

CC: Jon Maloy
CC: Allan Stephens
CC: "David S. Miller"
Signed-off-by: Kees Cook
Acked-by: David S. Miller

Kees Cook
2013-01-12 03:40:02 +0800

08 Dec, 2012

3 commits

0fef8f205 tipc: refactor accept() code for improved readability ... Browse Code »

In TIPC's accept() routine, there is a large block of code relating
to initialization of a new socket, all within an if condition checking
if the allocation succeeded.

Here, we simply flip the check of the if, so that the main execution
path stays at the same indentation level, which improves readability.
If the allocation fails, we jump to an already existing exit label.

Signed-off-by: Paul Gortmaker

Paul Gortmaker
2012-12-08 06:23:24 +0800
258f8667a tipc: add lock nesting notation to quiet lockdep warning ... Browse Code »

TIPC accept() call grabs the socket lock on a newly allocated
socket while holding the socket lock on an old socket. But lockdep
worries that this might be a recursive lock attempt:

[ INFO: possible recursive locking detected ]
---------------------------------------------
kworker/u:0/6 is trying to acquire lock:
(sk_lock-AF_TIPC){+.+.+.}, at: [] accept+0x15c/0x310 [tipc]

but task is already holding lock:
(sk_lock-AF_TIPC){+.+.+.}, at: [] accept+0x28/0x310 [tipc]

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(sk_lock-AF_TIPC);
lock(sk_lock-AF_TIPC);

*** DEADLOCK ***

May be due to missing lock nesting notation
[...]

Tell lockdep that this locking is safe by using lock_sock_nested().
This is similar to what was done in commit 5131a184a3458d9 for
SCTP code ("SCTP: lock_sock_nested in sctp_sock_migrate").

Also note that this is isn't something that is seen normally,
as it was uncovered with some experimental work-in-progress
code not yet ready for mainline. So no need for stable
backports or similar of this commit.

Signed-off-by: Ying Xue
Signed-off-by: Paul Gortmaker

Ying Xue
2012-12-08 06:23:23 +0800
cbab36879 tipc: eliminate connection setup for implied connect in recv_msg() ... Browse Code »

As connection setup is now completed asynchronously in BH context,
in the function filter_connect(), the corresponding code in recv_msg()
becomes redundant.

Signed-off-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: Paul Gortmaker

Ying Xue
2012-12-08 06:23:22 +0800