Eric Lee / smarc-fsl-linux-kernel

31 Mar, 2020

1 commit

5a470b1a6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Browse Code »

David S. Miller
2020-03-31 11:48:43 +0800

30 Mar, 2020

2 commits

582eea230 sctp: fix possibly using a bad saddr with a given dst ... Browse Code »

Under certain circumstances, depending on the order of addresses on the
interfaces, it could be that sctp_v[46]_get_dst() would return a dst
with a mismatched struct flowi.

For example, if when walking through the bind addresses and the first
one is not a match, it saves the dst as a fallback (added in
410f03831c07), but not the flowi. Then if the next one is also not a
match, the previous dst will be returned but with the flowi information
for the 2nd address, which is wrong.

The fix is to use a locally stored flowi that can be used for such
attempts, and copy it to the parameter only in case it is a possible
match, together with the corresponding dst entry.

The patch updates IPv6 code mostly just to be in sync. Even though the issue
is also present there, it fallback is not expected to work with IPv6.

Fixes: 410f03831c07 ("sctp: add routing output fallback")
Reported-by: Jin Meng
Signed-off-by: Marcelo Ricardo Leitner
Tested-by: Xin Long
Signed-off-by: David S. Miller

Marcelo Ricardo Leitner
2020-03-30 13:01:43 +0800
5c3e82fe1 sctp: fix refcount bug in sctp_wfree ... Browse Code »

We should iterate over the datamsgs to move
all chunks(skbs) to newsk.

The following case cause the bug:
for the trouble SKB, it was in outq->transmitted list

sctp_outq_sack
sctp_check_transmitted
SKB was moved to outq->sacked list
then throw away the sack queue
SKB was deleted from outq->sacked
(but it was held by datamsg at sctp_datamsg_to_asoc
So, sctp_wfree was not called here)

then migrate happened

sctp_for_each_tx_datachunk(
sctp_clear_owner_w);
sctp_assoc_migrate();
sctp_for_each_tx_datachunk(
sctp_set_owner_w);
SKB was not in the outq, and was not changed to newsk

finally

__sctp_outq_teardown
sctp_chunk_put (for another skb)
sctp_datamsg_put
__kfree_skb(msg->frag_list)
sctp_wfree (for SKB)
SKB->sk was still oldsk (skb->sk != asoc->base.sk).

Reported-and-tested-by: syzbot+cea71eec5d6de256d54d@syzkaller.appspotmail.com
Signed-off-by: Qiujun Huang
Acked-by: Marcelo Ricardo Leitner
Signed-off-by: David S. Miller

Qiujun Huang
2020-03-30 12:57:30 +0800

13 Mar, 2020

1 commit

1d3435793 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net ... Browse Code »

Minor overlapping changes, nothing serious.

Signed-off-by: David S. Miller

David S. Miller
2020-03-13 13:34:48 +0800

09 Mar, 2020

1 commit

83f73c5bb inet_diag: return classid for all socket types ... Browse Code »

In commit 1ec17dbd90f8 ("inet_diag: fix reporting cgroup classid and
fallback to priority") croup classid reporting was fixed. But this works
only for TCP sockets because for other socket types icsk parameter can
be NULL and classid code path is skipped. This change moves classid
handling to inet_diag_msg_attrs_fill() function.

Also inet_diag_msg_attrs_size() helper was added and addends in
nlmsg_new() were reordered to save order from inet_sk_diag_fill().

Fixes: 1ec17dbd90f8 ("inet_diag: fix reporting cgroup classid and fallback to priority")
Signed-off-by: Dmitry Yakunin
Reviewed-by: Konstantin Khlebnikov
Signed-off-by: David S. Miller

Dmitry Yakunin
2020-03-09 12:57:48 +0800

01 Mar, 2020

1 commit

9f0ca0c1a Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next ... Browse Code »

Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-02-28

The following pull-request contains BPF updates for your *net-next* tree.

We've added 41 non-merge commits during the last 7 day(s) which contain
a total of 49 files changed, 1383 insertions(+), 499 deletions(-).

The main changes are:

1) BPF and Real-Time nicely co-exist.

2) bpftool feature improvements.

3) retrieve bpf_sk_storage via INET_DIAG.
====================

Signed-off-by: David S. Miller

David S. Miller
2020-03-01 07:53:35 +0800

28 Feb, 2020

2 commits

0df6d3284 inet_diag: Move the INET_DIAG_REQ_BYTECODE nlattr to cb->data ... Browse Code »

The INET_DIAG_REQ_BYTECODE nlattr is currently re-found every time when
the "dump()" is re-started.

In a latter patch, it will also need to parse the new
INET_DIAG_REQ_SK_BPF_STORAGES nlattr to learn the map_fds. Thus, this
patch takes this chance to store the parsed nlattr in cb->data
during the "start" time of a dump.

By doing this, the "bc" argument also becomes unnecessary
and is removed. Also, the two copies of the INET_DIAG_REQ_BYTECODE
parsing-audit logic between compat/current version can be
consolidated to one.

Signed-off-by: Martin KaFai Lau
Signed-off-by: Alexei Starovoitov
Acked-by: Song Liu
Link: https://lore.kernel.org/bpf/20200225230415.1975555-1-kafai@fb.com

Martin KaFai Lau
2020-02-28 10:50:19 +0800
5682d393b inet_diag: Refactor inet_sk_diag_fill(), dump(), and dump_one() ... Browse Code »

In a latter patch, there is a need to update "cb->min_dump_alloc"
in inet_sk_diag_fill() as it learns the diffierent bpf_sk_storages
stored in a sk while dumping all sk(s) (e.g. tcp_hashinfo).

The inet_sk_diag_fill() currently does not take the "cb" as an argument.
One of the reason is inet_sk_diag_fill() is used by both dump_one()
and dump() (which belong to the "struct inet_diag_handler". The dump_one()
interface does not pass the "cb" along.

This patch is to make dump_one() pass a "cb". The "cb" is created in
inet_diag_cmd_exact(). The "nlh" and "in_skb" are stored in "cb" as
the dump() interface does. The total number of args in
inet_sk_diag_fill() is also cut from 10 to 7 and
that helps many callers to pass fewer args.

In particular,
"struct user_namespace *user_ns", "u32 pid", and "u32 seq"
can be replaced by accessing "cb->nlh" and "cb->skb".

A similar argument reduction is also made to
inet_twsk_diag_fill() and inet_req_diag_fill().

inet_csk_diag_dump() and inet_csk_diag_fill() are also removed.
They are mostly equivalent to inet_sk_diag_fill(). Their repeated
usages are very limited. Thus, inet_sk_diag_fill() is directly used
in those occasions.

Signed-off-by: Martin KaFai Lau
Signed-off-by: Alexei Starovoitov
Acked-by: Song Liu
Link: https://lore.kernel.org/bpf/20200225230409.1975173-1-kafai@fb.com

Martin KaFai Lau
2020-02-28 10:50:19 +0800

25 Feb, 2020

3 commits

b77b4f634 sctp: Add missing annotation for sctp_transport_walk_stop() ... Browse Code »

Sparse reports a warning at sctp_transport_walk_stop()

warning: context imbalance in sctp_transport_walk_stop
- wrong count at exit

The root cause is the missing annotation at sctp_transport_walk_stop()
Add the missing __releases(RCU) annotation

Signed-off-by: Jules Irenge
Signed-off-by: David S. Miller

Jules Irenge
2020-02-25 05:26:49 +0800
6c72b7740 sctp: Add missing annotation for sctp_transport_walk_start() ... Browse Code »

Sparse reports a warning at sctp_transport_walk_start()

warning: context imbalance in sctp_transport_walk_start
- wrong count at exit

The root cause is the missing annotation at sctp_transport_walk_start()
Add the missing __acquires(RCU) annotation

Signed-off-by: Jules Irenge
Signed-off-by: David S. Miller

Jules Irenge
2020-02-25 05:26:48 +0800
887cf3d13 sctp: Add missing annotation for sctp_err_finish() ... Browse Code »

Sparse reports a warning at sctp_err_finish()
warning: context imbalance in sctp_err_finish() - unexpected unlock

The root cause is a missing annotation at sctp_err_finish()
Add the missing __releases(&((__sk)->sk_lock.slock)) annotation

Signed-off-by: Jules Irenge
Signed-off-by: David S. Miller

Jules Irenge
2020-02-25 05:26:48 +0800

18 Feb, 2020

1 commit

245709ec8 sctp: move the format error check out of __sctp_sf_do_9_1_abort ... Browse Code »

When T2 timer is to be stopped, the asoc should also be deleted,
otherwise, there will be no chance to call sctp_association_free
and the asoc could last in memory forever.

However, in sctp_sf_shutdown_sent_abort(), after adding the cmd
SCTP_CMD_TIMER_STOP for T2 timer, it may return error due to the
format error from __sctp_sf_do_9_1_abort() and miss adding
SCTP_CMD_ASSOC_FAILED where the asoc will be deleted.

This patch is to fix it by moving the format error check out of
__sctp_sf_do_9_1_abort(), and do it before adding the cmd
SCTP_CMD_TIMER_STOP for T2 timer.

Thanks Hangbin for reporting this issue by the fuzz testing.

v1->v2:
- improve the comment in the code as Marcelo's suggestion.

Fixes: 96ca468b86b0 ("sctp: check invalid value of length parameter in error cause")
Reported-by: Hangbin Liu
Acked-by: Marcelo Ricardo Leitner
Signed-off-by: Xin Long
Signed-off-by: David S. Miller

Xin Long
2020-02-18 13:56:03 +0800

10 Jan, 2020

1 commit

a2d6d7ae5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net ... Browse Code »

The ungrafting from PRIO bug fixes in net, when merged into net-next,
merge cleanly but create a build failure. The resolution used here is
from Petr Machata.

Signed-off-by: David S. Miller

David S. Miller
2020-01-10 04:13:43 +0800

07 Jan, 2020

1 commit

be7a77292 sctp: free cmd->obj.chunk for the unprocessed SCTP_CMD_REPLY ... Browse Code »

This patch is to fix a memleak caused by no place to free cmd->obj.chunk
for the unprocessed SCTP_CMD_REPLY. This issue occurs when failing to
process a cmd while there're still SCTP_CMD_REPLY cmds on the cmd seq
with an allocated chunk in cmd->obj.chunk.

So fix it by freeing cmd->obj.chunk for each SCTP_CMD_REPLY cmd left on
the cmd seq when any cmd returns error. While at it, also remove 'nomem'
label.

Reported-by: syzbot+107c4aff5f392bf1517f@syzkaller.appspotmail.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long
Signed-off-by: David S. Miller

Xin Long
2020-01-07 05:28:37 +0800

01 Jan, 2020

1 commit

31d518f35 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net ... Browse Code »

Simple overlapping changes in bpf land wrt. bpf_helper_defs.h
handling.

Signed-off-by: David S. Miller

David S. Miller
2020-01-01 05:37:13 +0800

31 Dec, 2019

1 commit

f398efc14 sctp: add enabled check for path tracepoint loop. ... Browse Code »

sctp_outq_sack is the main function handles SACK, it is called very
frequently. As the commit "move trace_sctp_probe_path into sctp_outq_sack"
added below code to this function, sctp tracepoint is disabled most of time,
but the loop of transport list will be always called even though the
tracepoint is disabled, this is unnecessary.

+ /* SCTP path tracepoint for congestion control debugging. */
+ list_for_each_entry(transport, transport_list, transports) {
+ trace_sctp_probe_path(transport, asoc);
+ }

This patch is to add tracepoint enabled check at outside of the loop of
transport list, and avoid traversing the loop when trace is disabled,
it is a small optimization.

Signed-off-by: Kevin Kou
Acked-by: Marcelo Ricardo Leitner
Signed-off-by: David S. Miller

Kevin Kou
2019-12-31 12:33:05 +0800

28 Dec, 2019

1 commit

356b23c07 sctp: do trace_sctp_probe after SACK validation and check ... Browse Code »

The function sctp_sf_eat_sack_6_2 now performs the Verification
Tag validation, Chunk length validation, Bogu check, and also
the detection of out-of-order SACK based on the RFC2960
Section 6.2 at the beginning, and finally performs the further
processing of SACK. The trace_sctp_probe now triggered before
the above necessary validation and check.

this patch is to do the trace_sctp_probe after the chunk sanity
tests, but keep doing trace if the SACK received is out of order,
for the out-of-order SACK is valuable to congestion control
debugging.

v1->v2:
- keep doing SCTP trace if the SACK is out of order as Marcelo's
suggestion.
v2->v3:
- regenerate the patch as v2 generated on top of v1, and add
'net-next' tag to the new one as Marcelo's comments.

Signed-off-by: Kevin Kou
Acked-by: Marcelo Ricardo Leitner
Acked-by: Neil Horman
Signed-off-by: David S. Miller

Kevin Kou
2019-12-28 08:34:51 +0800

27 Dec, 2019

1 commit

f643ee295 sctp: move trace_sctp_probe_path into sctp_outq_sack ... Browse Code »

The original patch bringed in the "SCTP ACK tracking trace event"
feature was committed at Dec.20, 2017, it replaced jprobe usage
with trace events, and bringed in two trace events, one is
TRACE_EVENT(sctp_probe), another one is TRACE_EVENT(sctp_probe_path).
The original patch intended to trigger the trace_sctp_probe_path in
TRACE_EVENT(sctp_probe) as below code,

+TRACE_EVENT(sctp_probe,
+
+ TP_PROTO(const struct sctp_endpoint *ep,
+ const struct sctp_association *asoc,
+ struct sctp_chunk *chunk),
+
+ TP_ARGS(ep, asoc, chunk),
+
+ TP_STRUCT__entry(
+ __field(__u64, asoc)
+ __field(__u32, mark)
+ __field(__u16, bind_port)
+ __field(__u16, peer_port)
+ __field(__u32, pathmtu)
+ __field(__u32, rwnd)
+ __field(__u16, unack_data)
+ ),
+
+ TP_fast_assign(
+ struct sk_buff *skb = chunk->skb;
+
+ __entry->asoc = (unsigned long)asoc;
+ __entry->mark = skb->mark;
+ __entry->bind_port = ep->base.bind_addr.port;
+ __entry->peer_port = asoc->peer.port;
+ __entry->pathmtu = asoc->pathmtu;
+ __entry->rwnd = asoc->peer.rwnd;
+ __entry->unack_data = asoc->unack_data;
+
+ if (trace_sctp_probe_path_enabled()) {
+ struct sctp_transport *sp;
+
+ list_for_each_entry(sp, &asoc->peer.transport_addr_list,
+ transports) {
+ trace_sctp_probe_path(sp, asoc);
+ }
+ }
+ ),

But I found it did not work when I did testing, and trace_sctp_probe_path
had no output, I finally found that there is trace buffer lock
operation(trace_event_buffer_reserve) in include/trace/trace_events.h:

static notrace void \
trace_event_raw_event_##call(void *__data, proto) \
{ \
struct trace_event_file *trace_file = __data; \
struct trace_event_data_offsets_##call __maybe_unused __data_offsets;\
struct trace_event_buffer fbuffer; \
struct trace_event_raw_##call *entry; \
int __data_size; \
\
if (trace_trigger_soft_disabled(trace_file)) \
return; \
\
__data_size = trace_event_get_offsets_##call(&__data_offsets, args); \
\
entry = trace_event_buffer_reserve(&fbuffer, trace_file, \
sizeof(*entry) + __data_size); \
\
if (!entry) \
return; \
\
tstruct \
\
{ assign; } \
\
trace_event_buffer_commit(&fbuffer); \
}

The reason caused no output of trace_sctp_probe_path is that
trace_sctp_probe_path written in TP_fast_assign part of
TRACE_EVENT(sctp_probe), and it will be placed( { assign; } ) after the
trace_event_buffer_reserve() when compiler expands Macro,

entry = trace_event_buffer_reserve(&fbuffer, trace_file, \
sizeof(*entry) + __data_size); \
\
if (!entry) \
return; \
\
tstruct \
\
{ assign; } \

so trace_sctp_probe_path finally can not acquire trace_event_buffer
and return no output, that is to say the nest of tracepoint entry function
is not allowed. The function call flow is:

trace_sctp_probe()
-> trace_event_raw_event_sctp_probe()
-> lock buffer
-> trace_sctp_probe_path()
-> trace_event_raw_event_sctp_probe_path() --nested
-> buffer has been locked and return no output.

This patch is to remove trace_sctp_probe_path from the TP_fast_assign
part of TRACE_EVENT(sctp_probe) to avoid the nest of entry function,
and trigger sctp_probe_path_trace in sctp_outq_sack.

After this patch, you can enable both events individually,
# cd /sys/kernel/debug/tracing
# echo 1 > events/sctp/sctp_probe/enable
# echo 1 > events/sctp/sctp_probe_path/enable

Or, you can enable all the events under sctp.

# echo 1 > events/sctp/enable

Signed-off-by: Kevin Kou
Acked-by: Marcelo Ricardo Leitner
Signed-off-by: David S. Miller

Kevin Kou
2019-12-27 05:06:45 +0800

25 Dec, 2019

2 commits

bd085ef67 net: add bool confirm_neigh parameter for dst_ops.update_pmtu ... Browse Code »

The MTU update code is supposed to be invoked in response to real
networking events that update the PMTU. In IPv6 PMTU update function
__ip6_rt_update_pmtu() we called dst_confirm_neigh() to update neighbor
confirmed time.

But for tunnel code, it will call pmtu before xmit, like:
- tnl_update_pmtu()
- skb_dst_update_pmtu()
- ip6_rt_update_pmtu()
- __ip6_rt_update_pmtu()
- dst_confirm_neigh()

If the tunnel remote dst mac address changed and we still do the neigh
confirm, we will not be able to update neigh cache and ping6 remote
will failed.

So for this ip_tunnel_xmit() case, _EVEN_ if the MTU is changed, we
should not be invoking dst_confirm_neigh() as we have no evidence
of successful two-way communication at this point.

On the other hand it is also important to keep the neigh reachability fresh
for TCP flows, so we cannot remove this dst_confirm_neigh() call.

To fix the issue, we have to add a new bool parameter for dst_ops.update_pmtu
to choose whether we should do neigh update or not. I will add the parameter
in this patch and set all the callers to true to comply with the previous
way, and fix the tunnel code one by one on later patches.

v5: No change.
v4: No change.
v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
dst_ops.update_pmtu to control whether we should do neighbor confirm.
Also split the big patch to small ones for each area.
v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.

Suggested-by: David Miller
Reviewed-by: Guillaume Nault
Acked-by: David Ahern
Signed-off-by: Hangbin Liu
Signed-off-by: David S. Miller

Hangbin Liu
2019-12-25 14:28:54 +0800
61d5d4062 sctp: fix err handling of stream initialization ... Browse Code »

The fix on 951c6db954a1 fixed the issued reported there but introduced
another. When the allocation fails within sctp_stream_init() it is
okay/necessary to free the genradix. But it is also called when adding
new streams, from sctp_send_add_streams() and
sctp_process_strreset_addstrm_in() and in those situations it cannot
just free the genradix because by then it is a fully operational
association.

The fix here then is to only free the genradix in sctp_stream_init()
and on those other call sites move on with what it already had and let
the subsequent error handling to handle it.

Tested with the reproducers from this report and the previous one,
with lksctp-tools and sctp-tests.

Reported-by: syzbot+9a1bc632e78a1a98488b@syzkaller.appspotmail.com
Fixes: 951c6db954a1 ("sctp: fix memleak on err handling of stream initialization")
Signed-off-by: Marcelo Ricardo Leitner
Signed-off-by: David S. Miller

Marcelo Ricardo Leitner
2019-12-25 08:07:10 +0800

23 Dec, 2019

1 commit

ac80010fc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net ... Browse Code »

Mere overlapping changes in the conflicts here.

Signed-off-by: David S. Miller

David S. Miller
2019-12-23 07:15:05 +0800

18 Dec, 2019

1 commit

951c6db95 sctp: fix memleak on err handling of stream initialization ... Browse Code »

syzbot reported a memory leak when an allocation fails within
genradix_prealloc() for output streams. That's because
genradix_prealloc() leaves initialized members initialized when the
issue happens and SCTP stack will abort the current initialization but
without cleaning up such members.

The fix here is to always call genradix_free() when genradix_prealloc()
fails, for output and also input streams, as it suffers from the same
issue.

Reported-by: syzbot+772d9e36c490b18d51d1@syzkaller.appspotmail.com
Fixes: 2075e50caf5e ("sctp: convert to genradix")
Signed-off-by: Marcelo Ricardo Leitner
Tested-by: Xin Long
Signed-off-by: David S. Miller

Marcelo Ricardo Leitner
2019-12-18 13:58:37 +0800

10 Dec, 2019

2 commits

4e7696d90 sctp: get netns from asoc and ep base ... Browse Code »

Commit 312434617cb1 ("sctp: cache netns in sctp_ep_common") set netns
in asoc and ep base since they're created, and it will never change.
It's a better way to get netns from asoc and ep base, comparing to
calling sock_net().

This patch is to replace them.

v1->v2:
- no change.

Suggested-by: Marcelo Ricardo Leitner
Signed-off-by: Xin Long
Acked-by: Neil Horman
Acked-by: Marcelo Ricardo Leitner
Signed-off-by: David S. Miller

Xin Long
2019-12-10 12:14:01 +0800
b6f3320b1 sctp: fully initialize v4 addr in some functions ... Browse Code »

Syzbot found a crash:

BUG: KMSAN: uninit-value in crc32_body lib/crc32.c:112 [inline]
BUG: KMSAN: uninit-value in crc32_le_generic lib/crc32.c:179 [inline]
BUG: KMSAN: uninit-value in __crc32c_le_base+0x4fa/0xd30 lib/crc32.c:202
Call Trace:
crc32_body lib/crc32.c:112 [inline]
crc32_le_generic lib/crc32.c:179 [inline]
__crc32c_le_base+0x4fa/0xd30 lib/crc32.c:202
chksum_update+0xb2/0x110 crypto/crc32c_generic.c:90
crypto_shash_update+0x4c5/0x530 crypto/shash.c:107
crc32c+0x150/0x220 lib/libcrc32c.c:47
sctp_csum_update+0x89/0xa0 include/net/sctp/checksum.h:36
__skb_checksum+0x1297/0x12a0 net/core/skbuff.c:2640
sctp_compute_cksum include/net/sctp/checksum.h:59 [inline]
sctp_packet_pack net/sctp/output.c:528 [inline]
sctp_packet_transmit+0x40fb/0x4250 net/sctp/output.c:597
sctp_outq_flush_transports net/sctp/outqueue.c:1146 [inline]
sctp_outq_flush+0x1823/0x5d80 net/sctp/outqueue.c:1194
sctp_outq_uncork+0xd0/0xf0 net/sctp/outqueue.c:757
sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1781 [inline]
sctp_side_effects net/sctp/sm_sideeffect.c:1184 [inline]
sctp_do_sm+0x8fe1/0x9720 net/sctp/sm_sideeffect.c:1155
sctp_primitive_REQUESTHEARTBEAT+0x175/0x1a0 net/sctp/primitive.c:185
sctp_apply_peer_addr_params+0x212/0x1d40 net/sctp/socket.c:2433
sctp_setsockopt_peer_addr_params net/sctp/socket.c:2686 [inline]
sctp_setsockopt+0x189bb/0x19090 net/sctp/socket.c:4672

The issue was caused by transport->ipaddr set with uninit addr param, which
was passed by:

sctp_transport_init net/sctp/transport.c:47 [inline]
sctp_transport_new+0x248/0xa00 net/sctp/transport.c:100
sctp_assoc_add_peer+0x5ba/0x2030 net/sctp/associola.c:611
sctp_process_param net/sctp/sm_make_chunk.c:2524 [inline]

where 'addr' is set by sctp_v4_from_addr_param(), and it doesn't initialize
the padding of addr->v4.

Later when calling sctp_make_heartbeat(), hbinfo.daddr(=transport->ipaddr)
will become the part of skb, and the issue occurs.

This patch is to fix it by initializing the padding of addr->v4 in
sctp_v4_from_addr_param(), as well as other functions that do the similar
thing, and these functions shouldn't trust that the caller initializes the
memory, as Marcelo suggested.

Reported-by: syzbot+6dcbfea81cd3d4dd0b02@syzkaller.appspotmail.com
Signed-off-by: Xin Long
Acked-by: Neil Horman
Signed-off-by: David S. Miller

Xin Long
2019-12-10 02:16:39 +0800

05 Dec, 2019

1 commit

c4e85f73a net: ipv6: add net argument to ip6_dst_lookup_flow ... Browse Code »

This will be used in the conversion of ipv6_stub to ip6_dst_lookup_flow,
as some modules currently pass a net argument without a socket to
ip6_dst_lookup. This is equivalent to commit 343d60aada5a ("ipv6: change
ipv6_stub_impl.ipv6_dst_lookup to take net argument").

Signed-off-by: Sabrina Dubroca
Signed-off-by: David S. Miller

Sabrina Dubroca
2019-12-05 04:27:12 +0800

27 Nov, 2019

2 commits

82f31ebf6 net: port < inet_prot_sock(net) --> inet_port_requires_bind_service(net, port) ... Browse Code »

Note that the sysctl write accessor functions guarantee that:
net->ipv4.sysctl_ip_prot_sock ipv4.ip_local_ports.range[0]
invariant is maintained, and as such the max() in selinux hooks is actually spurious.

ie. even though
if (snum < max(inet_prot_sock(sock_net(sk)), low) || snum > high) {
per logic is the same as
if ((snum < inet_prot_sock(sock_net(sk)) && snum < low) || snum > high) {
it is actually functionally equivalent to:
if (snum < low || snum > high) {
which is equivalent to:
if (snum < inet_prot_sock(sock_net(sk)) || snum < low || snum > high) {
even though the first clause is spurious.

But we want to hold on to it in case we ever want to change what what
inet_port_requires_bind_service() means (for example by changing
it from a, by default, [0..1024) range to some sort of set).

Test: builds, git 'grep inet_prot_sock' finds no other references
Cc: Eric Dumazet
Signed-off-by: Maciej Żenczykowski
Signed-off-by: David S. Miller

Maciej Żenczykowski
2019-11-27 05:20:46 +0800
fb8223888 net-sctp: replace some sock_net(sk) with just 'net' ... Browse Code »

It already existed in part of the function, but move it
to a higher level and use it consistently throughout.

Safe since sk is never written to.

Signed-off-by: Maciej Żenczykowski
Acked-by: Marcelo Ricardo Leitner
Signed-off-by: David S. Miller

Maciej Żenczykowski
2019-11-27 05:18:32 +0800

26 Nov, 2019

1 commit

adf6f8cb3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net ... Browse Code »

Merge in networking bug fixes for merge window.

Signed-off-by: David S. Miller

David S. Miller
2019-11-26 06:57:26 +0800

24 Nov, 2019

2 commits

312434617 sctp: cache netns in sctp_ep_common ... Browse Code »

This patch is to fix a data-race reported by syzbot:

BUG: KCSAN: data-race in sctp_assoc_migrate / sctp_hash_obj

write to 0xffff8880b67c0020 of 8 bytes by task 18908 on cpu 1:
sctp_assoc_migrate+0x1a6/0x290 net/sctp/associola.c:1091
sctp_sock_migrate+0x8aa/0x9b0 net/sctp/socket.c:9465
sctp_accept+0x3c8/0x470 net/sctp/socket.c:4916
inet_accept+0x7f/0x360 net/ipv4/af_inet.c:734
__sys_accept4+0x224/0x430 net/socket.c:1754
__do_sys_accept net/socket.c:1795 [inline]
__se_sys_accept net/socket.c:1792 [inline]
__x64_sys_accept+0x4e/0x60 net/socket.c:1792
do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x44/0xa9

read to 0xffff8880b67c0020 of 8 bytes by task 12003 on cpu 0:
sctp_hash_obj+0x4f/0x2d0 net/sctp/input.c:894
rht_key_get_hash include/linux/rhashtable.h:133 [inline]
rht_key_hashfn include/linux/rhashtable.h:159 [inline]
rht_head_hashfn include/linux/rhashtable.h:174 [inline]
head_hashfn lib/rhashtable.c:41 [inline]
rhashtable_rehash_one lib/rhashtable.c:245 [inline]
rhashtable_rehash_chain lib/rhashtable.c:276 [inline]
rhashtable_rehash_table lib/rhashtable.c:316 [inline]
rht_deferred_worker+0x468/0xab0 lib/rhashtable.c:420
process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
worker_thread+0xa0/0x800 kernel/workqueue.c:2415
kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352

It was caused by rhashtable access asoc->base.sk when sctp_assoc_migrate
is changing its value. However, what rhashtable wants is netns from asoc
base.sk, and for an asoc, its netns won't change once set. So we can
simply fix it by caching netns since created.

Fixes: d6c0256a60e6 ("sctp: add the rhashtable apis for sctp global transport hashtable")
Reported-by: syzbot+e3b35fe7918ff0ee474e@syzkaller.appspotmail.com
Signed-off-by: Xin Long
Acked-by: Marcelo Ricardo Leitner
Signed-off-by: Jakub Kicinski

Xin Long
2019-11-24 10:26:14 +0800
b6631c603 sctp: Fix memory leak in sctp_sf_do_5_2_4_dupcook ... Browse Code »

In the implementation of sctp_sf_do_5_2_4_dupcook() the allocated
new_asoc is leaked if security_sctp_assoc_request() fails. Release it
via sctp_association_free().

Fixes: 2277c7cd75e3 ("sctp: Add LSM hooks")
Signed-off-by: Navid Emamdoost
Acked-by: Marcelo Ricardo Leitner
Signed-off-by: Jakub Kicinski

Navid Emamdoost
2019-11-24 10:20:17 +0800

09 Nov, 2019

5 commits

d467ac0a3 sctp: add SCTP_PEER_ADDR_THLDS_V2 sockopt ... Browse Code »

Section 7.2 of rfc7829: "Peer Address Thresholds (SCTP_PEER_ADDR_THLDS)
Socket Option" extends 'struct sctp_paddrthlds' with 'spt_pathcpthld'
added to allow a user to change ps_retrans per sock/asoc/transport, as
other 2 paddrthlds: pf_retrans, pathmaxrxt.

Note: to not break the user's program, here to support pf_retrans dump
and setting by adding a new sockopt SCTP_PEER_ADDR_THLDS_V2, and a new
structure sctp_paddrthlds_v2 instead of extending sctp_paddrthlds.

Also, when setting ps_retrans, the value is not allowed to be greater
than pf_retrans.

v1->v2:
- use SCTP_PEER_ADDR_THLDS_V2 to set/get pf_retrans instead,
as Marcelo and David Laight suggested.

Signed-off-by: Xin Long
Acked-by: Neil Horman
Signed-off-by: David S. Miller

Xin Long
2019-11-09 06:18:32 +0800
34515e94c sctp: add support for Primary Path Switchover ... Browse Code »

This is a new feature defined in section 5 of rfc7829: "Primary Path
Switchover". By introducing a new tunable parameter:

Primary.Switchover.Max.Retrans (PSMR)

The primary path will be changed to another active path when the path
error counter on the old primary path exceeds PSMR, so that "the SCTP
sender is allowed to continue data transmission on a new working path
even when the old primary destination address becomes active again".

This patch is to add this tunable parameter, 'ps_retrans' per netns,
sock, asoc and transport. It also allows a user to change ps_retrans
per netns by sysctl, and ps_retrans per sock/asoc/transport will be
initialized with it.

The check will be done in sctp_do_8_2_transport_strike() when this
feature is enabled.

Note this feature is disabled by initializing 'ps_retrans' per netns
as 0xffff by default, and its value can't be less than 'pf_retrans'
when changing by sysctl.

v3->v4:
- add define SCTP_PS_RETRANS_MAX 0xffff, and use it on extra2 of
sysctl 'ps_retrans'.
- add a new entry for ps_retrans on ip-sysctl.txt.

Signed-off-by: Xin Long
Acked-by: Neil Horman
Signed-off-by: David S. Miller

Xin Long
2019-11-09 06:18:32 +0800
8d2a6935d sctp: add SCTP_EXPOSE_POTENTIALLY_FAILED_STATE sockopt ... Browse Code »

This is a sockopt defined in section 7.3 of rfc7829: "Exposing
the Potentially Failed Path State", by which users can change
pf_expose per sock and asoc.

The new sockopt SCTP_EXPOSE_POTENTIALLY_FAILED_STATE is also
known as SCTP_EXPOSE_PF_STATE for short.

v2->v3:
- return -EINVAL if params.assoc_value > SCTP_PF_EXPOSE_MAX.
- define SCTP_EXPOSE_PF_STATE SCTP_EXPOSE_POTENTIALLY_FAILED_STATE.
v3->v4:
- improve changelog.

Signed-off-by: Xin Long
Acked-by: Neil Horman
Signed-off-by: David S. Miller

Xin Long
2019-11-09 06:18:32 +0800
768e15182 sctp: add SCTP_ADDR_POTENTIALLY_FAILED notification ... Browse Code »

SCTP Quick failover draft section 5.1, point 5 has been removed
from rfc7829. Instead, "the sender SHOULD (i) notify the Upper
Layer Protocol (ULP) about this state transition", as said in
section 3.2, point 8.

So this patch is to add SCTP_ADDR_POTENTIALLY_FAILED, defined
in section 7.1, "which is reported if the affected address
becomes PF". Also remove transport cwnd's update when moving
from PF back to ACTIVE , which is no longer in rfc7829 either.

Note that ulp_notify will be set to false if asoc->expose is
not 'enabled', according to last patch.

v2->v3:
- define SCTP_ADDR_PF SCTP_ADDR_POTENTIALLY_FAILED.
v3->v4:
- initialize spc_state with SCTP_ADDR_AVAILABLE, as Marcelo suggested.
- check asoc->pf_expose in sctp_assoc_control_transport(), as Marcelo
suggested.

Signed-off-by: Xin Long
Acked-by: Neil Horman
Signed-off-by: David S. Miller

Xin Long
2019-11-09 06:18:32 +0800
aef587be4 sctp: add pf_expose per netns and sock and asoc ... Browse Code »

As said in rfc7829, section 3, point 12:

The SCTP stack SHOULD expose the PF state of its destination
addresses to the ULP as well as provide the means to notify the
ULP of state transitions of its destination addresses from
active to PF, and vice versa. However, it is recommended that
an SCTP stack implementing SCTP-PF also allows for the ULP to be
kept ignorant of the PF state of its destinations and the
associated state transitions, thus allowing for retention of the
simpler state transition model of [RFC4960] in the ULP.

Not only does it allow to expose the PF state to ULP, but also
allow to ignore sctp-pf to ULP.

So this patch is to add pf_expose per netns, sock and asoc. And in
sctp_assoc_control_transport(), ulp_notify will be set to false if
asoc->expose is not 'enabled' in next patch.

It also allows a user to change pf_expose per netns by sysctl, and
pf_expose per sock and asoc will be initialized with it.

Note that pf_expose also works for SCTP_GET_PEER_ADDR_INFO sockopt,
to not allow a user to query the state of a sctp-pf peer address
when pf_expose is 'disabled', as said in section 7.3.

v1->v2:
- Fix a build warning noticed by Nathan Chancellor.
v2->v3:
- set pf_expose to UNUSED by default to keep compatible with old
applications.
v3->v4:
- add a new entry for pf_expose on ip-sysctl.txt, as Marcelo suggested.
- change this patch to 1/5, and move sctp_assoc_control_transport
change into 2/5, as Marcelo suggested.
- use SCTP_PF_EXPOSE_UNSET instead of SCTP_PF_EXPOSE_UNUSED, and
set SCTP_PF_EXPOSE_UNSET to 0 in enum, as Marcelo suggested.

Signed-off-by: Xin Long
Acked-by: Neil Horman
Signed-off-by: David S. Miller

Xin Long
2019-11-09 06:18:32 +0800

07 Nov, 2019

3 commits

099ecf59f net: annotate lockless accesses to sk->sk_max_ack_backlog ... Browse Code »

sk->sk_max_ack_backlog can be read without any lock being held
at least in TCP/DCCP cases.

We need to use READ_ONCE()/WRITE_ONCE() to avoid load/store tearing
and/or potential KCSAN warnings.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2019-11-07 08:14:48 +0800
288efe860 net: annotate lockless accesses to sk->sk_ack_backlog ... Browse Code »

sk->sk_ack_backlog can be read without any lock being held.
We need to use READ_ONCE()/WRITE_ONCE() to avoid load/store tearing
and/or potential KCSAN warnings.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2019-11-07 08:14:48 +0800
7976a11b3 net: use helpers to change sk_ack_backlog ... Browse Code »

Writers are holding a lock, but many readers do not.

Following patch will add appropriate barriers in
sk_acceptq_removed() and sk_acceptq_added().

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2019-11-07 08:14:48 +0800

03 Nov, 2019

1 commit

d31e95585 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net ... Browse Code »

The only slightly tricky merge conflict was the netdevsim because the
mutex locking fix overlapped a lot of driver reload reorganization.

The rest were (relatively) trivial in nature.

Signed-off-by: David S. Miller

David S. Miller
2019-11-03 04:54:56 +0800

02 Nov, 2019

1 commit

a904a0693 inet: stop leaking jiffies on the wire ... Browse Code »

Historically linux tried to stick to RFC 791, 1122, 2003
for IPv4 ID field generation.

RFC 6864 made clear that no matter how hard we try,
we can not ensure unicity of IP ID within maximum
lifetime for all datagrams with a given source
address/destination address/protocol tuple.

Linux uses a per socket inet generator (inet_id), initialized
at connection startup with a XOR of 'jiffies' and other
fields that appear clear on the wire.

Thiemo Nagel pointed that this strategy is a privacy
concern as this provides 16 bits of entropy to fingerprint
devices.

Let's switch to a random starting point, this is just as
good as far as RFC 6864 is concerned and does not leak
anything critical.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet
Reported-by: Thiemo Nagel
Signed-off-by: David S. Miller

Eric Dumazet
2019-11-02 05:57:52 +0800