Eric Lee / smarc-fsl-linux-kernel

02 Mar, 2017

12 commits

13baa00ad net: net_enable_timestamp() can be called from irq contexts ... Browse Code »

It is now very clear that silly TCP listeners might play with
enabling/disabling timestamping while new children are added
to their accept queue.

Meaning net_enable_timestamp() can be called from BH context
while current state of the static key is not enabled.

Lets play safe and allow all contexts.

The work queue is scheduled only under the problematic cases,
which are the static key enable/disable transition, to not slow down
critical paths.

This extends and improves what we did in commit 5fa8bbda38c6 ("net: use
a work queue to defer net_disable_timestamp() work")

Fixes: b90e5794c5bd ("net: dont call jump_label_dec from irq context")
Signed-off-by: Eric Dumazet
Reported-by: Dmitry Vyukov
Signed-off-by: David S. Miller

Eric Dumazet
2017-03-02 12:55:57 +0800
540e2894f net: don't call strlen() on the user buffer in packet_bind_spkt() ... Browse Code »

KMSAN (KernelMemorySanitizer, a new error detection tool) reports use of
uninitialized memory in packet_bind_spkt():
Acked-by: Eric Dumazet

==================================================================
BUG: KMSAN: use of unitialized memory
CPU: 0 PID: 1074 Comm: packet Not tainted 4.8.0-rc6+ #1891
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
01/01/2011
0000000000000000 ffff88006b6dfc08 ffffffff82559ae8 ffff88006b6dfb48
ffffffff818a7c91 ffffffff85b9c870 0000000000000092 ffffffff85b9c550
0000000000000000 0000000000000092 00000000ec400911 0000000000000002
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[] dump_stack+0x238/0x290 lib/dump_stack.c:51
[] kmsan_report+0x276/0x2e0 mm/kmsan/kmsan.c:1003
[] __msan_warning+0x5b/0xb0
mm/kmsan/kmsan_instr.c:424
[< inline >] strlen lib/string.c:484
[] strlcpy+0x9d/0x200 lib/string.c:144
[] packet_bind_spkt+0x144/0x230
net/packet/af_packet.c:3132
[] SYSC_bind+0x40d/0x5f0 net/socket.c:1370
[] SyS_bind+0x82/0xa0 net/socket.c:1356
[] entry_SYSCALL_64_fastpath+0x13/0x8f
arch/x86/entry/entry_64.o:?
chained origin: 00000000eba00911
[] save_stack_trace+0x27/0x50
arch/x86/kernel/stacktrace.c:67
[< inline >] kmsan_save_stack_with_flags mm/kmsan/kmsan.c:322
[< inline >] kmsan_save_stack mm/kmsan/kmsan.c:334
[] kmsan_internal_chain_origin+0x118/0x1e0
mm/kmsan/kmsan.c:527
[] __msan_set_alloca_origin4+0xc3/0x130
mm/kmsan/kmsan_instr.c:380
[] SYSC_bind+0x129/0x5f0 net/socket.c:1356
[] SyS_bind+0x82/0xa0 net/socket.c:1356
[] entry_SYSCALL_64_fastpath+0x13/0x8f
arch/x86/entry/entry_64.o:?
origin description: ----address@SYSC_bind (origin=00000000eb400911)
==================================================================
(the line numbers are relative to 4.8-rc6, but the bug persists
upstream)

, when I run the following program as root:

=====================================
#include
#include
#include
#include

int main() {
struct sockaddr addr;
memset(&addr, 0xff, sizeof(addr));
addr.sa_family = AF_PACKET;
int fd = socket(PF_PACKET, SOCK_PACKET, htons(ETH_P_ALL));
bind(fd, &addr, sizeof(addr));
return 0;
}
=====================================

This happens because addr.sa_data copied from the userspace is not
zero-terminated, and copying it with strlcpy() in packet_bind_spkt()
results in calling strlen() on the kernel copy of that non-terminated
buffer.

Signed-off-by: Alexander Potapenko
Signed-off-by: David S. Miller

Alexander Potapenko
2017-03-02 12:55:57 +0800
8953de2f0 net: bridge: allow IPv6 when multicast flood is disabled ... Browse Code »

Even with multicast flooding turned off, IPv6 ND should still work so
that IPv6 connectivity is provided. Allow this by continuing to flood
multicast traffic originated by us.

Fixes: b6cb5ac8331b ("net: bridge: add per-port multicast flood flag")
Cc: Nikolay Aleksandrov
Signed-off-by: Mike Manning
Acked-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller

Mike Manning
2017-03-02 12:55:57 +0800
16c54ac90 Merge tag 'mac80211-for-davem-2017-02-28' of git://git.kernel.org/pub/scm/linux/… ... Browse Code »

…kernel/git/jberg/mac80211

Johannes Berg says:

====================
First round of fixes - details in the commits:
* use a valid hrtimer clock ID in mac80211_hwsim
* don't reorder frames prior to BA session
* flush a delayed work at suspend so the state is all valid before
suspend/resume
* fix packet statistics in fast-RX, the RX packets
counter increment was simply missing
* don't try to re-transmit filtered frames in an aggregation session
* shorten (for tracing) a debug message
* typo fix in another debug message
* fix nul-termination with HWSIM_ATTR_RADIO_NAME in hwsim
* fix mgmt RX processing when station is looked up by driver/device
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

David S. Miller
2017-03-02 07:08:34 +0800
449809a66 tcp/dccp: block BH for SYN processing ... Browse Code »

SYN processing really was meant to be handled from BH.

When I got rid of BH blocking while processing socket backlog
in commit 5413d1babe8f ("net: do not block BH while processing socket
backlog"), I forgot that a malicious user could transition to TCP_LISTEN
from a state that allowed (SYN) packets to be parked in the socket
backlog while socket is owned by the thread doing the listen() call.

Sure enough syzkaller found this and reported the bug ;)

=================================
[ INFO: inconsistent lock state ]
4.10.0+ #60 Not tainted
---------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
syz-executor0/5090 [HC0[0]:SC0[0]:HE1:SE1] takes:
(&(&hashinfo->ehash_locks[i])->rlock){+.?...}, at:
[] spin_lock include/linux/spinlock.h:299 [inline]
(&(&hashinfo->ehash_locks[i])->rlock){+.?...}, at:
[] inet_ehash_insert+0x240/0xad0
net/ipv4/inet_hashtables.c:407
{IN-SOFTIRQ-W} state was registered at:
mark_irqflags kernel/locking/lockdep.c:2923 [inline]
__lock_acquire+0xbcf/0x3270 kernel/locking/lockdep.c:3295
lock_acquire+0x241/0x580 kernel/locking/lockdep.c:3753
__raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
_raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
spin_lock include/linux/spinlock.h:299 [inline]
inet_ehash_insert+0x240/0xad0 net/ipv4/inet_hashtables.c:407
reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:753 [inline]
inet_csk_reqsk_queue_hash_add+0x1b7/0x2a0 net/ipv4/inet_connection_sock.c:764
tcp_conn_request+0x25cc/0x3310 net/ipv4/tcp_input.c:6399
tcp_v4_conn_request+0x157/0x220 net/ipv4/tcp_ipv4.c:1262
tcp_rcv_state_process+0x802/0x4130 net/ipv4/tcp_input.c:5889
tcp_v4_do_rcv+0x56b/0x940 net/ipv4/tcp_ipv4.c:1433
tcp_v4_rcv+0x2e12/0x3210 net/ipv4/tcp_ipv4.c:1711
ip_local_deliver_finish+0x4ce/0xc40 net/ipv4/ip_input.c:216
NF_HOOK include/linux/netfilter.h:257 [inline]
ip_local_deliver+0x1ce/0x710 net/ipv4/ip_input.c:257
dst_input include/net/dst.h:492 [inline]
ip_rcv_finish+0xb1d/0x2110 net/ipv4/ip_input.c:396
NF_HOOK include/linux/netfilter.h:257 [inline]
ip_rcv+0xd90/0x19c0 net/ipv4/ip_input.c:487
__netif_receive_skb_core+0x1ad1/0x3400 net/core/dev.c:4179
__netif_receive_skb+0x2a/0x170 net/core/dev.c:4217
netif_receive_skb_internal+0x1d6/0x430 net/core/dev.c:4245
napi_skb_finish net/core/dev.c:4602 [inline]
napi_gro_receive+0x4e6/0x680 net/core/dev.c:4636
e1000_receive_skb drivers/net/ethernet/intel/e1000/e1000_main.c:4033 [inline]
e1000_clean_rx_irq+0x5e0/0x1490
drivers/net/ethernet/intel/e1000/e1000_main.c:4489
e1000_clean+0xb9a/0x2910 drivers/net/ethernet/intel/e1000/e1000_main.c:3834
napi_poll net/core/dev.c:5171 [inline]
net_rx_action+0xe70/0x1900 net/core/dev.c:5236
__do_softirq+0x2fb/0xb7d kernel/softirq.c:284
invoke_softirq kernel/softirq.c:364 [inline]
irq_exit+0x19e/0x1d0 kernel/softirq.c:405
exiting_irq arch/x86/include/asm/apic.h:658 [inline]
do_IRQ+0x81/0x1a0 arch/x86/kernel/irq.c:250
ret_from_intr+0x0/0x20
native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
default_idle+0x8f/0x410 arch/x86/kernel/process.c:271
arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:262
default_idle_call+0x36/0x60 kernel/sched/idle.c:96
cpuidle_idle_call kernel/sched/idle.c:154 [inline]
do_idle+0x348/0x440 kernel/sched/idle.c:243
cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:345
start_secondary+0x344/0x440 arch/x86/kernel/smpboot.c:272
verify_cpu+0x0/0xfc
irq event stamp: 1741
hardirqs last enabled at (1741): []
__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160
[inline]
hardirqs last enabled at (1741): []
_raw_spin_unlock_irqrestore+0xf7/0x1a0 kernel/locking/spinlock.c:191
hardirqs last disabled at (1740): []
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline]
hardirqs last disabled at (1740): []
_raw_spin_lock_irqsave+0xa2/0x110 kernel/locking/spinlock.c:159
softirqs last enabled at (1738): []
__do_softirq+0x7cf/0xb7d kernel/softirq.c:310
softirqs last disabled at (1571): []
do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:902

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(&(&hashinfo->ehash_locks[i])->rlock);

lock(&(&hashinfo->ehash_locks[i])->rlock);

*** DEADLOCK ***

1 lock held by syz-executor0/5090:
#0: (sk_lock-AF_INET6){+.+.+.}, at: [] lock_sock
include/net/sock.h:1460 [inline]
#0: (sk_lock-AF_INET6){+.+.+.}, at: []
sock_setsockopt+0x233/0x1e40 net/core/sock.c:683

stack backtrace:
CPU: 1 PID: 5090 Comm: syz-executor0 Not tainted 4.10.0+ #60
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:15 [inline]
dump_stack+0x292/0x398 lib/dump_stack.c:51
print_usage_bug+0x3ef/0x450 kernel/locking/lockdep.c:2387
valid_state kernel/locking/lockdep.c:2400 [inline]
mark_lock_irq kernel/locking/lockdep.c:2602 [inline]
mark_lock+0xf30/0x1410 kernel/locking/lockdep.c:3065
mark_irqflags kernel/locking/lockdep.c:2941 [inline]
__lock_acquire+0x6dc/0x3270 kernel/locking/lockdep.c:3295
lock_acquire+0x241/0x580 kernel/locking/lockdep.c:3753
__raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
_raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
spin_lock include/linux/spinlock.h:299 [inline]
inet_ehash_insert+0x240/0xad0 net/ipv4/inet_hashtables.c:407
reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:753 [inline]
inet_csk_reqsk_queue_hash_add+0x1b7/0x2a0 net/ipv4/inet_connection_sock.c:764
dccp_v6_conn_request+0xada/0x11b0 net/dccp/ipv6.c:380
dccp_rcv_state_process+0x51e/0x1660 net/dccp/input.c:606
dccp_v6_do_rcv+0x213/0x350 net/dccp/ipv6.c:632
sk_backlog_rcv include/net/sock.h:896 [inline]
__release_sock+0x127/0x3a0 net/core/sock.c:2052
release_sock+0xa5/0x2b0 net/core/sock.c:2539
sock_setsockopt+0x60f/0x1e40 net/core/sock.c:1016
SYSC_setsockopt net/socket.c:1782 [inline]
SyS_setsockopt+0x2fb/0x3a0 net/socket.c:1765
entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x4458b9
RSP: 002b:00007fe8b26c2b58 EFLAGS: 00000292 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00000000004458b9
RDX: 000000000000001a RSI: 0000000000000001 RDI: 0000000000000006
RBP: 00000000006e2110 R08: 0000000000000010 R09: 0000000000000000
R10: 00000000208c3000 R11: 0000000000000292 R12: 0000000000708000
R13: 0000000020000000 R14: 0000000000001000 R15: 0000000000000000

Fixes: 5413d1babe8f ("net: do not block BH while processing socket backlog")
Signed-off-by: Eric Dumazet
Reported-by: Andrey Konovalov
Acked-by: Soheil Hassas Yeganeh
Signed-off-by: David S. Miller

Eric Dumazet
2017-03-02 07:03:31 +0800
df2c43343 bridge: Fix error path in nbp_vlan_init ... Browse Code »

Fix error path order in nbp_vlan_init, so if switchdev_port_attr_set
call failes, the vlan_hash wouldn't be destroyed before inited.

Fixes: efa5356b0d97 ("bridge: per vlan dst_metadata netlink support")
CC: Roopa Prabhu
Signed-off-by: Yotam Gigi
Acked-by: Roopa Prabhu
Reviewed-by: Jiri Pirko
Signed-off-by: David S. Miller

Yotam Gigi
2017-03-02 06:55:28 +0800
3b45a4106 net: route: add missing nla_policy entry for RTA_MARK attribute ... Browse Code »

This will add stricter validating for RTA_MARK attribute.

Signed-off-by: Liping Zhang
Signed-off-by: David S. Miller

Liping Zhang
2017-03-02 02:25:56 +0800
8c171d6ca net/ipv6: avoid possible dead locking on addr_gen_mode sysctl ... Browse Code »

The addr_gen_mode variable can be accessed by both sysctl and netlink.
Repleacd rtnl_lock() with rtnl_trylock() protect the sysctl operation to
avoid the possbile dead lock.`

Signed-off-by: Felix Jia
Signed-off-by: David S. Miller

Felix Jia
2017-03-02 02:22:48 +0800
39e6c8208 net: solve a NAPI race ... Browse Code »

While playing with mlx4 hardware timestamping of RX packets, I found
that some packets were received by TCP stack with a ~200 ms delay...

Since the timestamp was provided by the NIC, and my probe was added
in tcp_v4_rcv() while in BH handler, I was confident it was not
a sender issue, or a drop in the network.

This would happen with a very low probability, but hurting RPC
workloads.

A NAPI driver normally arms the IRQ after the napi_complete_done(),
after NAPI_STATE_SCHED is cleared, so that the hard irq handler can grab
it.

Problem is that if another point in the stack grabs NAPI_STATE_SCHED bit
while IRQ are not disabled, we might have later an IRQ firing and
finding this bit set, right before napi_complete_done() clears it.

This can happen with busy polling users, or if gro_flush_timeout is
used. But some other uses of napi_schedule() in drivers can cause this
as well.

thread 1 thread 2 (could be on same cpu, or not)

// busy polling or napi_watchdog()
napi_schedule();
...
napi->poll()

device polling:
read 2 packets from ring buffer
Additional 3rd packet is
available.
device hard irq

// does nothing because
NAPI_STATE_SCHED bit is owned by thread 1
napi_schedule();

napi_complete_done(napi, 2);
rearm_irq();

Note that rearm_irq() will not force the device to send an additional
IRQ for the packet it already signaled (3rd packet in my example)

This patch adds a new NAPI_STATE_MISSED bit, that napi_schedule_prep()
can set if it could not grab NAPI_STATE_SCHED

Then napi_complete_done() properly reschedules the napi to make sure
we do not miss something.

Since we manipulate multiple bits at once, use cmpxchg() like in
sk_busy_loop() to provide proper transactions.

In v2, I changed napi_watchdog() to use a relaxed variant of
napi_schedule_prep() : No need to set NAPI_STATE_MISSED from this point.

In v3, I added more details in the changelog and clears
NAPI_STATE_MISSED in busy_poll_stop()

In v4, I added the ideas given by Alexander Duyck in v3 review

Signed-off-by: Eric Dumazet
Cc: Alexander Duyck
Signed-off-by: David S. Miller

Eric Dumazet
2017-03-02 01:50:58 +0800
4f7bfb398 rds: ib: add the static type to the variables ... Browse Code »

The variables rds_ib_mr_1m_pool_size and rds_ib_mr_8k_pool_size
are used only in the ib.c file. As such, the static type is
added to limit them in this file.

Cc: Joe Jin
Cc: Junxiao Bi
Signed-off-by: Zhu Yanjun
Acked-by: Santosh Shilimkar
Signed-off-by: David S. Miller

Zhu Yanjun
2017-03-02 01:50:58 +0800
5179b2669 sctp: call rcu_read_lock before checking for duplicate transport nodes ... Browse Code »

Commit cd2b70875058 ("sctp: check duplicate node before inserting a
new transport") called rhltable_lookup() to check for the duplicate
transport node in transport rhashtable.

But rhltable_lookup() doesn't call rcu_read_lock inside, it could cause
a use-after-free issue if it tries to dereference the node that another
cpu has freed it. Note that sock lock can not avoid this as it is per
sock.

This patch is to fix it by calling rcu_read_lock before checking for
duplicate transport nodes.

Fixes: cd2b70875058 ("sctp: check duplicate node before inserting a new transport")
Reported-by: Andrey Konovalov
Signed-off-by: Xin Long
Acked-by: Neil Horman
Signed-off-by: David S. Miller

Xin Long
2017-03-02 01:50:58 +0800
540b1c48c rxrpc: Fix deadlock between call creation and sendmsg/recvmsg ... Browse Code »

All the routines by which rxrpc is accessed from the outside are serialised
by means of the socket lock (sendmsg, recvmsg, bind,
rxrpc_kernel_begin_call(), ...) and this presents a problem:

(1) If a number of calls on the same socket are in the process of
connection to the same peer, a maximum of four concurrent live calls
are permitted before further calls need to wait for a slot.

(2) If a call is waiting for a slot, it is deep inside sendmsg() or
rxrpc_kernel_begin_call() and the entry function is holding the socket
lock.

(3) sendmsg() and recvmsg() or the in-kernel equivalents are prevented
from servicing the other calls as they need to take the socket lock to
do so.

(4) The socket is stuck until a call is aborted and makes its slot
available to the waiter.

Fix this by:

(1) Provide each call with a mutex ('user_mutex') that arbitrates access
by the users of rxrpc separately for each specific call.

(2) Make rxrpc_sendmsg() and rxrpc_recvmsg() unlock the socket as soon as
they've got a call and taken its mutex.

Note that I'm returning EWOULDBLOCK from recvmsg() if MSG_DONTWAIT is
set but someone else has the lock. Should I instead only return
EWOULDBLOCK if there's nothing currently to be done on a socket, and
sleep in this particular instance because there is something to be
done, but we appear to be blocked by the interrupt handler doing its
ping?

(3) Make rxrpc_new_client_call() unlock the socket after allocating a new
call, locking its user mutex and adding it to the socket's call tree.
The call is returned locked so that sendmsg() can add data to it
immediately.

From the moment the call is in the socket tree, it is subject to
access by sendmsg() and recvmsg() - even if it isn't connected yet.

(4) Lock new service calls in the UDP data_ready handler (in
rxrpc_new_incoming_call()) because they may already be in the socket's
tree and the data_ready handler makes them live immediately if a user
ID has already been preassigned.

Note that the new call is locked before any notifications are sent
that it is live, so doing mutex_trylock() *ought* to always succeed.
Userspace is prevented from doing sendmsg() on calls that are in a
too-early state in rxrpc_do_sendmsg().

(5) Make rxrpc_new_incoming_call() return the call with the user mutex
held so that a ping can be scheduled immediately under it.

Note that it might be worth moving the ping call into
rxrpc_new_incoming_call() and then we can drop the mutex there.

(6) Make rxrpc_accept_call() take the lock on the call it is accepting and
release the socket after adding the call to the socket's tree. This
is slightly tricky as we've dequeued the call by that point and have
to requeue it.

Note that requeuing emits a trace event.

(7) Make rxrpc_kernel_send_data() and rxrpc_kernel_recv_data() take the
new mutex immediately and don't bother with the socket mutex at all.

This patch has the nice bonus that calls on the same socket are now to some
extent parallelisable.

Note that we might want to move rxrpc_service_prealloc() calls out from the
socket lock and give it its own lock, so that we don't hang progress in
other calls because we're waiting for the allocator.

We probably also want to avoid calling rxrpc_notify_socket() from within
the socket lock (rxrpc_accept_call()).

Signed-off-by: David Howells
Tested-by: Marc Dionne
Signed-off-by: David S. Miller

David Howells
2017-03-02 01:50:58 +0800

01 Mar, 2017

4 commits

cf393195c Merge branch 'idr-4.11' of git://git.infradead.org/users/willy/linux-dax ... Browse Code »

Pull IDR rewrite from Matthew Wilcox:
"The most significant part of the following is the patch to rewrite the
IDR & IDA to be clients of the radix tree. But there's much more,
including an enhancement of the IDA to be significantly more space
efficient, an IDR & IDA test suite, some improvements to the IDR API
(and driver changes to take advantage of those improvements), several
improvements to the radix tree test suite and RCU annotations.

The IDR & IDA rewrite had a good spin in linux-next and Andrew's tree
for most of the last cycle. Coupled with the IDR test suite, I feel
pretty confident that any remaining bugs are quite hard to hit. 0-day
did a great job of watching my git tree and pointing out problems; as
it hit them, I added new test-cases to be sure not to be caught the
same way twice"

Willy goes on to expand a bit on the IDR rewrite rationale:
"The radix tree and the IDR use very similar data structures.

Merging the two codebases lets us share the memory allocation pools,
and results in a net deletion of 500 lines of code. It also opens up
the possibility of exposing more of the features of the radix tree to
users of the IDR (and I have some interesting patches along those
lines waiting for 4.12)

It also shrinks the size of the 'struct idr' from 40 bytes to 24 which
will shrink a fair few data structures that embed an IDR"

* 'idr-4.11' of git://git.infradead.org/users/willy/linux-dax: (32 commits)
radix tree test suite: Add config option for map shift
idr: Add missing __rcu annotations
radix-tree: Fix __rcu annotations
radix-tree: Add rcu_dereference and rcu_assign_pointer calls
radix tree test suite: Run iteration tests for longer
radix tree test suite: Fix split/join memory leaks
radix tree test suite: Fix leaks in regression2.c
radix tree test suite: Fix leaky tests
radix tree test suite: Enable address sanitizer
radix_tree_iter_resume: Fix out of bounds error
radix-tree: Store a pointer to the root in each node
radix-tree: Chain preallocated nodes through ->parent
radix tree test suite: Dial down verbosity with -v
radix tree test suite: Introduce kmalloc_verbose
idr: Return the deleted entry from idr_remove
radix tree test suite: Build separate binaries for some tests
ida: Use exceptional entries for small IDAs
ida: Move ida_bitmap to a percpu variable
Reimplement IDR and IDA using the radix tree
radix-tree: Add radix_tree_iter_delete
...

Linus Torvalds
2017-03-01 12:29:41 +0800
8313064c2 Merge tag 'nfsd-4.11' of git://linux-nfs.org/~bfields/linux ... Browse Code »

Pull nfsd updates from Bruce Fields:
"The nfsd update this round is mainly a lot of miscellaneous cleanups
and bugfixes.

A couple changes could theoretically break working setups on upgrade.
I don't expect complaints in practice, but they seem worth calling out
just in case:

- NFS security labels are now off by default; a new security_label
export flag reenables it per export. But, having them on by default
is a disaster, as it generally only makes sense if all your clients
and servers have similar enough selinux policies. Thanks to Jason
Tibbitts for pointing this out.

- NFSv4/UDP support is off. It was never really supported, and the
spec explicitly forbids it. We only ever left it on out of
laziness; thanks to Jeff Layton for finally fixing that"

* tag 'nfsd-4.11' of git://linux-nfs.org/~bfields/linux: (34 commits)
nfsd: Fix display of the version string
nfsd: fix configuration of supported minor versions
sunrpc: don't register UDP port with rpcbind when version needs congestion control
nfs/nfsd/sunrpc: enforce transport requirements for NFSv4
sunrpc: flag transports as having congestion control
sunrpc: turn bitfield flags in svc_version into bools
nfsd: remove superfluous KERN_INFO
nfsd: special case truncates some more
nfsd: minor nfsd_setattr cleanup
NFSD: Reserve adequate space for LOCKT operation
NFSD: Get response size before operation for all RPCs
nfsd/callback: Drop a useless data copy when comparing sessionid
nfsd/callback: skip the callback tag
nfsd/callback: Cleanup callback cred on shutdown
nfsd/idmap: return nfserr_inval for 0-length names
SUNRPC/Cache: Always treat the invalid cache as unexpired
SUNRPC: Drop all entries from cache_detail when cache_purge()
svcrdma: Poll CQs in "workqueue" mode
svcrdma: Combine list fields in struct svc_rdma_op_ctxt
svcrdma: Remove unused sc_dto_q field
...

Linus Torvalds
2017-03-01 07:39:09 +0800
b2deee2dc Merge tag 'ceph-for-4.11-rc1' of git://github.com/ceph/ceph-client ... Browse Code »

Pull ceph updates from Ilya Dryomov:
"This time around we have:

- support for rbd data-pool feature, which enables rbd images on
erasure-coded pools (myself). CEPH_PG_MAX_SIZE has been bumped to
allow erasure-coded profiles with k+m up to 32.

- a patch for ceph_d_revalidate() performance regression introduced
in 4.9, along with some cleanups in the area (Jeff Layton)

- a set of fixes for unsafe ->d_parent accesses in CephFS (Jeff
Layton)

- buffered reads are now processed in rsize windows instead of rasize
windows (Andreas Gerstmayr). The new default for rsize mount option
is 64M.

- ack vs commit distinction is gone, greatly simplifying ->fsync()
and MOSDOpReply handling code (myself)

... also a few filesystem bug fixes from Zheng, a CRUSH sync up (CRUSH
computations are still serialized though) and several minor fixes and
cleanups all over"

* tag 'ceph-for-4.11-rc1' of git://github.com/ceph/ceph-client: (52 commits)
libceph, rbd, ceph: WRITE | ONDISK -> WRITE
libceph: get rid of ack vs commit
ceph: remove special ack vs commit behavior
ceph: tidy some white space in get_nonsnap_parent()
crush: fix dprintk compilation
crush: do is_out test only if we do not collide
ceph: remove req from unsafe list when unregistering it
rbd: constify device_type structure
rbd: kill obj_request->object_name and rbd_segment_name_cache
rbd: store and use obj_request->object_no
rbd: RBD_V{1,2}_DATA_FORMAT macros
rbd: factor out __rbd_osd_req_create()
rbd: set offset and length outside of rbd_obj_request_create()
rbd: support for data-pool feature
rbd: introduce rbd_init_layout()
rbd: use rbd_obj_bytes() more
rbd: remove now unused rbd_obj_request_wait() and helpers
rbd: switch rbd_obj_method_sync() to ceph_osdc_call()
libceph: pass reply buffer length through ceph_osdc_call()
rbd: do away with obj_request in rbd_obj_read_sync()
...

Linus Torvalds
2017-03-01 07:36:09 +0800
c2eca00fe Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Pull networking fixes from David Miller:

1) Don't save TIPC header values before the header has been validated,
from Jon Paul Maloy.

2) Fix memory leak in RDS, from Zhu Yanjun.

3) We miss to initialize the UID in the flow key in some paths, from
Julian Anastasov.

4) Fix latent TOS masking bug in the routing cache removal from years
ago, also from Julian.

5) We forget to set the sockaddr port in sctp_copy_local_addr_list(),
fix from Xin Long.

6) Missing module ref count drop in packet scheduler actions, from
Roman Mashak.

7) Fix RCU annotations in rht_bucket_nested, from Herbert Xu.

8) Fix use after free which happens because L2TP's ipv4 support returns
non-zero values from it's backlog_rcv function which ipv4 interprets
as protocol values. Fix from Paul Hüber.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (35 commits)
qed: Don't use attention PTT for configuring BW
qed: Fix race with multiple VFs
l2tp: avoid use-after-free caused by l2tp_ip_backlog_recv
xfrm: provide correct dst in xfrm_neigh_lookup
rhashtable: Fix RCU dereference annotation in rht_bucket_nested
rhashtable: Fix use before NULL check in bucket_table_free
net sched actions: do not overwrite status of action creation.
rxrpc: Kernel calls get stuck in recvmsg
net sched actions: decrement module reference count after table flush.
lib: Allow compile-testing of parman
ipv6: check sk sk_type and protocol early in ip_mroute_set/getsockopt
sctp: set sin_port for addr param when checking duplicate address
net/mlx4_en: fix overflow in mlx4_en_init_timestamp()
netfilter: nft_set_bitmap: incorrect bitmap size
net: s2io: fix typo argumnet argument
net: vxge: fix typo argumnet argument
netfilter: nf_ct_expect: Change __nf_ct_expect_check() return value.
ipv4: mask tos for input route
ipv4: add missing initialization for flowi4_uid
lib: fix spelling mistake: "actualy" -> "actually"
...

Linus Torvalds
2017-03-01 02:00:39 +0800

28 Feb, 2017

6 commits

19d19e960 mac80211: use driver-indicated transmitter STA only for data frames ... Browse Code »

When I originally introduced using the driver-indicated station as an
optimisation to avoid the hashtable lookup/iteration, of course it
wasn't intended to really functionally change anything.

I neglected, however, to take into account VLAN interfaces, which have
the property that management and data frames are handled differently:
data frames go directly to the station and the VLAN while management
frames continue to be processed over the underlying/associated AP-type
interface. As a consequence, when a driver used this optimisation for
management frames and the user enabled VLANs, my change broke things
since any management frames, particularly disassoc/deauth, were missed
by hostapd.

Fix this by restoring the original code path for non-data frames, they
aren't critical for performance to begin with.

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=194713.

Big thanks goes to Jarek who bisected the issue and provided a very
detailed bug report, including the crucial information that he was
using VLANs in his configuration.

Cc: stable@vger.kernel.org
Fixes: 771e846bea9e ("mac80211: allow passing transmitter station on RX")
Reported-and-tested-by: Jarek Kamiński
Signed-off-by: Johannes Berg

Johannes Berg
2017-02-28 14:42:05 +0800
5b5e0928f lib/vsprintf.c: remove %Z support ... Browse Code »

Now that %z is standartised in C99 there is no reason to support %Z.
Unlike %L it doesn't even make format strings smaller.

Use BUILD_BUG_ON in a couple ATM drivers.

In case anyone didn't notice lib/vsprintf.o is about half of SLUB which
is in my opinion is quite an achievement. Hopefully this patch inspires
someone else to trim vsprintf.c more.

Link: http://lkml.kernel.org/r/20170103230126.GA30170@avx2
Signed-off-by: Alexey Dobriyan
Cc: Andy Shevchenko
Cc: Rasmus Villemoes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexey Dobriyan
2017-02-28 10:43:47 +0800
b564d62e6 scripts/spelling.txt: add "varible" pattern and fix typo instances ... Browse Code »

Fix typos and add the following to the scripts/spelling.txt:

varible||variable

While we are here, tidy up the comment blocks that fit in a single line
for drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c and
net/sctp/transport.c.

Link: http://lkml.kernel.org/r/1481573103-11329-11-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Masahiro Yamada
2017-02-28 10:43:47 +0800
550116d21 scripts/spelling.txt: add "aligment" pattern and fix typo instances ... Browse Code »

Fix typos and add the following to the scripts/spelling.txt:

aligment||alignment

I did not touch the "N_BYTE_ALIGMENT" macro in
drivers/net/wireless/realtek/rtlwifi/wifi.h to avoid unpredictable
impact.

I fixed "_aligment_handler" in arch/openrisc/kernel/entry.S because
it is surrounded by #if 0 ... #endif. It is surely safe and I
confirmed "_alignment_handler" is correct.

I also fixed the "controler" I found in the same hunk in
arch/openrisc/kernel/head.S.

Link: http://lkml.kernel.org/r/1481573103-11329-8-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Masahiro Yamada
2017-02-28 10:43:46 +0800
9332ef9db scripts/spelling.txt: add "an user" pattern and fix typo instances ... Browse Code »

Fix typos and add the following to the scripts/spelling.txt:

an user||a user
an userspace||a userspace

I also added "userspace" to the list since it is a common word in Linux.
I found some instances for "an userfaultfd", but I did not add it to the
list. I felt it is endless to find words that start with "user" such as
"userland" etc., so must draw a line somewhere.

Link: http://lkml.kernel.org/r/1481573103-11329-4-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Masahiro Yamada
2017-02-28 10:43:46 +0800
08a7e621f scripts/spelling.txt: add "swith" pattern and fix typo instances ... Browse Code »

Fix typos and add the following to the scripts/spelling.txt:

swith||switch
swithable||switchable
swithed||switched
swithing||switching

While we are here, fix the "update" to "updates" in the touched hunk in
drivers/net/wireless/marvell/mwifiex/wmm.c.

Link: http://lkml.kernel.org/r/1481573103-11329-2-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Masahiro Yamada
2017-02-28 10:43:46 +0800

27 Feb, 2017

18 commits

4ca257eed Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf ... Browse Code »

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains netfilter fixes for you net tree,
they are:

1) Missing ct zone size in the nft_ct initialization path, patch
from Florian Westphal.

2) Two patches for netfilter uapi headers, one to remove unnecessary
sysctl.h inclusion and another to fix compilation of xt_hashlimit.h
in userspace, from Dmitry V. Levin.

3) Patch to fix a sloppy change in nf_ct_expect that incorrectly
simplified nf_ct_expect_related_report() in the previous nf-next
batch. This also includes another patch for __nf_ct_expect_check()
to report success by returning 0 to keep it consistent with other
existing functions. From Jarno Rajahalme.

4) The ->walk() iterator of the new bitmap set type goes over the real
bitmap size, this results in incorrect dumps when NFTA_SET_USERDATA
is used.
====================

Signed-off-by: David S. Miller

David S. Miller
2017-02-27 22:17:43 +0800
09e0a2fe1 mac80211: fix typo in debug print ... Browse Code »

Signed-off-by: Sara Sharon
Signed-off-by: Johannes Berg

Sara Sharon
2017-02-27 21:09:49 +0800
2595d259b mac80211: shorten debug message ... Browse Code »

Tracing is limited to 100 characters and this message passes
the limit when there are a few buffered frames. Shorten it.

Signed-off-by: Sara Sharon
Signed-off-by: Johannes Berg

Sara Sharon
2017-02-27 21:09:26 +0800
d98937f4e mac80211: fix power saving clients handling in iwlwifi ... Browse Code »

iwlwifi now supports RSS and can't let mac80211 track the
PS state based on the Rx frames since they can come out of
order. iwlwifi is now advertising AP_LINK_PS, and uses
explicit notifications to teach mac80211 about the PS state
of the stations and the PS poll / uAPSD trigger frames
coming our way from the peers.

Because of that, the TIM stopped being maintained in
mac80211. I tried to fix this in commit c68df2e7be0c
("mac80211: allow using AP_LINK_PS with mac80211-generated TIM IE")
but that was later reverted by Felix in commit 6c18a6b4e799
("Revert "mac80211: allow using AP_LINK_PS with mac80211-generated TIM IE")
since it broke drivers that do not implement set_tim.

Since none of the drivers that set AP_LINK_PS have the
set_tim() handler set besides iwlwifi, I can bail out in
__sta_info_recalc_tim if AP_LINK_PS AND .set_tim is not
implemented.

Signed-off-by: Emmanuel Grumbach
Signed-off-by: Johannes Berg

Emmanuel Grumbach
2017-02-27 21:07:47 +0800
890030d3c mac80211: don't handle filtered frames within a BA session ... Browse Code »

When running a BA session, the driver (or the hardware) already takes
care of retransmitting failed frames, since it has to keep the receiver
reorder window in sync.

Adding another layer of retransmit around that does not improve
anything. In fact, it can only lead to some strong reordering with huge
latency.

Cc: stable@vger.kernel.org
Signed-off-by: Felix Fietkau
Signed-off-by: Johannes Berg

Felix Fietkau
2017-02-27 21:06:00 +0800
b7540d8f2 mac80211: don't reorder frames with SN smaller than SSN ... Browse Code »

When RX aggregation starts, transmitter may continue send frames
with SN smaller than SSN until the AddBA response is received.
However, the reorder buffer is already initialized at this point,
which will cause the drop of such frames as duplicates since the
head SN of the reorder buffer is set to the SSN, which is bigger.

Cc: stable@vger.kernel.org
Signed-off-by: Sara Sharon
Signed-off-by: Johannes Berg

Sara Sharon
2017-02-27 21:00:26 +0800
a9e9200d8 mac80211: flush delayed work when entering suspend ... Browse Code »

The issue was found when entering suspend and resume.
It triggers a warning in:
mac80211/key.c: ieee80211_enable_keys()
...
WARN_ON_ONCE(sdata->crypto_tx_tailroom_needed_cnt ||
sdata->crypto_tx_tailroom_pending_dec);
...

It points out sdata->crypto_tx_tailroom_pending_dec isn't cleaned up successfully
in a delayed_work during suspend. Add a flush_delayed_work to fix it.

Cc: stable@vger.kernel.org
Signed-off-by: Matt Chen
Signed-off-by: Johannes Berg

Matt Chen
2017-02-27 21:00:26 +0800
0328edc77 mac80211: fix packet statistics for fast-RX ... Browse Code »

When adding per-CPU statistics, which added statistics back
to mac80211 for the fast-RX path, I evidently forgot to add
the "stats->packets++" line. The reason for that is likely
that I didn't see it since it's done in defragmentation for
the regular RX path.

Add the missing line to properly count received packets in
the fast-RX case.

Fixes: c9c5962b56c1 ("mac80211: enable collecting station statistics per-CPU")
Reported-by: Oren Givon
Signed-off-by: Johannes Berg

Johannes Berg
2017-02-27 21:00:26 +0800
51fb60eb1 l2tp: avoid use-after-free caused by l2tp_ip_backlog_recv ... Browse Code »

l2tp_ip_backlog_recv may not return -1 if the packet gets dropped.
The return value is passed up to ip_local_deliver_finish, which treats
negative values as an IP protocol number for resubmission.

Signed-off-by: Paul Hüber
Signed-off-by: David S. Miller

Paul Hüber
2017-02-27 11:57:17 +0800
1ecc9ad02 xfrm: provide correct dst in xfrm_neigh_lookup ... Browse Code »

Fix xfrm_neigh_lookup to provide dst->path to the
neigh_lookup dst_ops method.

When skb is provided, the IP address in packet should already
match the dst->path address family. But for the non-skb case,
we should consider the last tunnel address as nexthop address.

Fixes: f894cbf847c9 ("net: Add optional SKB arg to dst_ops->neigh_lookup().")
Signed-off-by: Julian Anastasov
Signed-off-by: David S. Miller

Julian Anastasov
2017-02-27 10:35:24 +0800
37f1c63e3 net sched actions: do not overwrite status of action creation. ... Browse Code »

nla_memdup_cookie was overwriting err value, declared at function
scope and earlier initialized with result of ->init(). At success
nla_memdup_cookie() returns 0, and thus module refcnt decremented,
although the action was installed.

$ sudo tc actions add action pass index 1 cookie 1234
$ sudo tc actions ls action gact

action order 0: gact action pass
random type none pass val 0
index 1 ref 1 bind 0
$
$ lsmod
Module Size Used by
act_gact 16384 0
...
$
$ sudo rmmod act_gact
[ 52.310283] ------------[ cut here ]------------
[ 52.312551] WARNING: CPU: 1 PID: 455 at kernel/module.c:1113
module_put+0x99/0xa0
[ 52.316278] Modules linked in: act_gact(-) crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel psmouse pcbc evbug aesni_intel aes_x86_64 crypto_simd
serio_raw glue_helper pcspkr cryptd
[ 52.322285] CPU: 1 PID: 455 Comm: rmmod Not tainted 4.10.0+ #11
[ 52.324261] Call Trace:
[ 52.325132] dump_stack+0x63/0x87
[ 52.326236] __warn+0xd1/0xf0
[ 52.326260] warn_slowpath_null+0x1d/0x20
[ 52.326260] module_put+0x99/0xa0
[ 52.326260] tcf_hashinfo_destroy+0x7f/0x90
[ 52.326260] gact_exit_net+0x27/0x40 [act_gact]
[ 52.326260] ops_exit_list.isra.6+0x38/0x60
[ 52.326260] unregister_pernet_operations+0x90/0xe0
[ 52.326260] unregister_pernet_subsys+0x21/0x30
[ 52.326260] tcf_unregister_action+0x68/0xa0
[ 52.326260] gact_cleanup_module+0x17/0xa0f [act_gact]
[ 52.326260] SyS_delete_module+0x1ba/0x220
[ 52.326260] entry_SYSCALL_64_fastpath+0x1e/0xad
[ 52.326260] RIP: 0033:0x7f527ffae367
[ 52.326260] RSP: 002b:00007ffeb402a598 EFLAGS: 00000202 ORIG_RAX:
00000000000000b0
[ 52.326260] RAX: ffffffffffffffda RBX: 0000559b069912a0 RCX: 00007f527ffae367
[ 52.326260] RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559b06991308
[ 52.326260] RBP: 0000000000000003 R08: 00007f5280264420 R09: 00007ffeb4029511
[ 52.326260] R10: 000000000000087b R11: 0000000000000202 R12: 00007ffeb4029580
[ 52.326260] R13: 0000000000000000 R14: 0000000000000000 R15: 0000559b069912a0
[ 52.354856] ---[ end trace 90d89401542b0db6 ]---
$

With the fix:

$ sudo modprobe act_gact
$ lsmod
Module Size Used by
act_gact 16384 0
...
$ sudo tc actions add action pass index 1 cookie 1234
$ sudo tc actions ls action gact

action order 0: gact action pass
random type none pass val 0
index 1 ref 1 bind 0
$
$ lsmod
Module Size Used by
act_gact 16384 1
...
$ sudo rmmod act_gact
rmmod: ERROR: Module act_gact is in use
$
$ sudo /home/mrv/bin/tc actions del action gact index 1
$ sudo rmmod act_gact
$ lsmod
Module Size Used by
$

Fixes: 1045ba77a ("net sched actions: Add support for user cookies")
Signed-off-by: Roman Mashak
Signed-off-by: Jamal Hadi Salim
Signed-off-by: David S. Miller

Roman Mashak
2017-02-27 10:31:32 +0800
d7e15835a rxrpc: Kernel calls get stuck in recvmsg ... Browse Code »

Calls made through the in-kernel interface can end up getting stuck because
of a missed variable update in a loop in rxrpc_recvmsg_data(). The problem
is like this:

(1) A new packet comes in and doesn't cause a notification to be given to
the client as there's still another packet in the ring - the
assumption being that if the client will keep drawing off data until
the ring is empty.

(2) The client is in rxrpc_recvmsg_data(), inside the big while loop that
iterates through the packets. This copies the window pointers into
variables rather than using the information in the call struct
because:

(a) MSG_PEEK might be in effect;

(b) we need a barrier after reading call->rx_top to pair with the
barrier in the softirq routine that loads the buffer.

(3) The reading of call->rx_top is done outside of the loop, and top is
never updated whilst we're in the loop. This means that even through
there's a new packet available, we don't see it and may return -EFAULT
to the caller - who will happily return to the scheduler and await the
next notification.

(4) No further notifications are forthcoming until there's an abort as the
ring isn't empty.

The fix is to move the read of call->rx_top inside the loop - but it needs
to be done before the condition is checked.

Reported-by: Marc Dionne
Signed-off-by: David Howells
Tested-by: Marc Dionne
Signed-off-by: David S. Miller

David Howells
2017-02-27 10:30:12 +0800
edb9d1bff net sched actions: decrement module reference count after table flush. ... Browse Code »

When tc actions are loaded as a module and no actions have been installed,
flushing them would result in actions removed from the memory, but modules
reference count not being decremented, so that the modules would not be
unloaded.

Following is example with GACT action:

% sudo modprobe act_gact
% lsmod
Module Size Used by
act_gact 16384 0
%
% sudo tc actions ls action gact
%
% sudo tc actions flush action gact
% lsmod
Module Size Used by
act_gact 16384 1
% sudo tc actions flush action gact
% lsmod
Module Size Used by
act_gact 16384 2
% sudo rmmod act_gact
rmmod: ERROR: Module act_gact is in use
....

After the fix:
% lsmod
Module Size Used by
act_gact 16384 0
%
% sudo tc actions add action pass index 1
% sudo tc actions add action pass index 2
% sudo tc actions add action pass index 3
% lsmod
Module Size Used by
act_gact 16384 3
%
% sudo tc actions flush action gact
% lsmod
Module Size Used by
act_gact 16384 0
%
% sudo tc actions flush action gact
% lsmod
Module Size Used by
act_gact 16384 0
% sudo rmmod act_gact
% lsmod
Module Size Used by
%

Fixes: f97017cdefef ("net-sched: Fix actions flushing")
Signed-off-by: Roman Mashak
Signed-off-by: Jamal Hadi Salim
Acked-by: Cong Wang
Signed-off-by: David S. Miller

Roman Mashak
2017-02-27 10:28:41 +0800
99253eb75 ipv6: check sk sk_type and protocol early in ip_mroute_set/getsockopt ... Browse Code »

Commit 5e1859fbcc3c ("ipv4: ipmr: various fixes and cleanups") fixed
the issue for ipv4 ipmr:

ip_mroute_setsockopt() & ip_mroute_getsockopt() should not
access/set raw_sk(sk)->ipmr_table before making sure the socket
is a raw socket, and protocol is IGMP

The same fix should be done for ipv6 ipmr as well.

This patch can fix the panic caused by overwriting the same offset
as ipmr_table as in raw_sk(sk) when accessing other type's socket
by ip_mroute_setsockopt().

Signed-off-by: Xin Long
Signed-off-by: David S. Miller

Xin Long
2017-02-27 10:25:46 +0800
2e3ce5bc2 sctp: set sin_port for addr param when checking duplicate address ... Browse Code »

Commit b8607805dd15 ("sctp: not copying duplicate addrs to the assoc's
bind address list") tried to check for duplicate address before copying
to asoc's bind_addr list from global addr list.

But all the addrs' sin_ports in global addr list are 0 while the addrs'
sin_ports are bp->port in asoc's bind_addr list. It means even if it's
a duplicate address, af->cmp_addr will still return 0 as the their
sin_ports are different.

This patch is to fix it by setting the sin_port for addr param with
bp->port before comparing the addrs.

Fixes: b8607805dd15 ("sctp: not copying duplicate addrs to the assoc's bind address list")
Reported-by: Wei Chen
Signed-off-by: Xin Long
Signed-off-by: David S. Miller

Xin Long
2017-02-27 10:24:05 +0800
13aa5a8f4 netfilter: nft_set_bitmap: incorrect bitmap size ... Browse Code »

priv->bitmap_size stores the real bitmap size, instead of the full
struct nft_bitmap object.

Fixes: 665153ff5752 ("netfilter: nf_tables: add bitmap set type")
Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2017-02-27 04:00:19 +0800
4b86c459c netfilter: nf_ct_expect: Change __nf_ct_expect_check() return value. ... Browse Code »

Commit 4dee62b1b9b4 ("netfilter: nf_ct_expect: nf_ct_expect_insert()
returns void") inadvertently changed the successful return value of
nf_ct_expect_related_report() from 0 to 1 due to
__nf_ct_expect_check() returning 1 on success. Prevent this
regression in the future by changing the return value of
__nf_ct_expect_check() to 0 on success.

Signed-off-by: Jarno Rajahalme
Acked-by: Joe Stringer
Signed-off-by: Pablo Neira Ayuso

Jarno Rajahalme
2017-02-27 00:06:59 +0800
6e28099d3 ipv4: mask tos for input route ... Browse Code »

Restore the lost masking of TOS in input route code to
allow ip rules to match it properly.

Problem [1] noticed by Shmulik Ladkani

[1] http://marc.info/?t=137331755300040&r=1&w=2

Fixes: 89aef8921bfb ("ipv4: Delete routing cache.")
Signed-off-by: Julian Anastasov
Signed-off-by: David S. Miller

Julian Anastasov
2017-02-27 00:03:38 +0800