Eric Lee / smarc-fsl-linux-kernel

04 Feb, 2017

14 commits

ad864d9fc net: mpls: Fix multipath selection for LSR use case

[ Upstream commit 9f427a0e474a67b454420c131709600d44850486 ]

MPLS multipath for LSR is broken -- always selecting the first nexthop
in the one label case. For example:

$ ip -f mpls ro ls
100
nexthop as to 200 via inet 172.16.2.2 dev virt12
nexthop as to 300 via inet 172.16.3.2 dev virt13
101
nexthop as to 201 via inet6 2000:2::2 dev virt12
nexthop as to 301 via inet6 2000:3::2 dev virt13

In this example incoming packets have a single MPLS label, which means
the BOS bit is set. The BOS bit is passed from mpls_forward down to
mpls_multipath_hash, which never processes the hash loop because BOS is 1.

Update mpls_multipath_hash to process the entire label stack. mpls_hdr_len
tracks the total mpls header length on each pass (on pass N mpls_hdr_len
is N * sizeof(mpls_shim_hdr)). When the label with the BOS bit set is found,
it verifies the skb has sufficient header for ipv4 or ipv6, and finds the
IPv4 or IPv6 header by using the last mpls_hdr pointer and adding 1 to
advance past it.
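
The label-stack walk described above can be illustrated with a small userspace sketch. This is not the kernel's mpls_multipath_hash: the names read_be32 and mpls_stack_walk are hypothetical, the hash is a toy multiplicative mix chosen only for readability, and the field positions follow the standard 32-bit label stack entry layout (label in the top 20 bits, bottom-of-stack flag at bit 8).

```c
#include <stddef.h>
#include <stdint.h>

/* Bit positions inside one 32-bit MPLS label stack entry. */
#define MPLS_LABEL_SHIFT 12
#define MPLS_BOS_SHIFT   8
#define MPLS_ENTRY_SIZE  4

/* Hypothetical helper: read a big-endian 32-bit word from a byte buffer. */
static uint32_t read_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Walk the whole label stack, mixing every label into *hash, including
 * the entry that carries the bottom-of-stack (BOS) bit.
 * Returns the offset of the payload that follows the stack, or -1 if the
 * buffer ends before an entry with BOS set is seen.
 */
static int mpls_stack_walk(const uint8_t *buf, size_t len, uint32_t *hash)
{
	size_t hdr_len = 0;	/* after pass N this is N * MPLS_ENTRY_SIZE */
	uint32_t h = 0;

	for (;;) {
		if (len - hdr_len < MPLS_ENTRY_SIZE)
			return -1;	/* truncated stack */

		uint32_t entry = read_be32(buf + hdr_len);
		hdr_len += MPLS_ENTRY_SIZE;

		/* Toy mix (assumption); the label is always hashed first. */
		h = h * 31 + (entry >> MPLS_LABEL_SHIFT);

		/* Only stop after the BOS entry has been processed. */
		if ((entry >> MPLS_BOS_SHIFT) & 1)
			break;
	}

	*hash = h;
	return (int)hdr_len;
}
```

The point of the sketch is the loop shape: the BOS test comes after the label has been hashed, so a single-label packet still contributes to the hash instead of skipping the loop entirely.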

With these changes I have verified the code correctly sees the label,
BOS, IPv4 and IPv6 addresses in the network header and icmp/tcp/udp
traffic for ipv4 and ipv6 are distributed across the nexthops.

Fixes: 1c78efa8319ca ("mpls: flow-based multipath selection")
Acked-by: Robert Shearman
Signed-off-by: David Ahern
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

David Ahern
2017-02-04 16:47:10 +0800
74423145d bridge: netlink: call br_changelink() during br_dev_newlink()

[ Upstream commit b6677449dff674cf5b81429b11d5c7f358852ef9 ]

Any bridge options specified during link creation (e.g. ip link add)
are ignored as br_dev_newlink() does not process them.
Use br_changelink() to do it.

Fixes: 133235161721 ("bridge: implement rtnl_link_ops->changelink")
Signed-off-by: Ivan Vecera
Reviewed-by: Jiri Pirko
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Ivan Vecera
2017-02-04 16:47:10 +0800
0c687a735 tcp: initialize max window for a new fastopen socket

[ Upstream commit 0dbd7ff3ac5017a46033a9d0a87a8267d69119d9 ]

Found that if we run LTP netstress test with large MSS (65K),
the first attempt from server to send data comparable to this
MSS on fastopen connection will be delayed by the probe timer.

Here is an example:

< S seq 0:0 win 43690 options [mss 65495 wscale 7 tfo cookie] length 32
> S. seq 0:0 ack 1 win 43690 options [mss 65495 wscale 7] length 0
< . ack 1 win 342 length 0

Inside tcp_sendmsg(), tcp_send_mss() returns max MSS in 'mss_now',
as well as in 'size_goal'. This results in the segment not being queued
for transmission until all the data is copied from the user buffer. Then,
inside __tcp_push_pending_frames(), it breaks on the send window test and
continues with the check probe timer.

Fragmentation occurs in tcp_write_wakeup()...

+0.2 > P. seq 1:43777 ack 1 win 342 length 43776
< . ack 43777, win 1365 length 0
> P. seq 43777:65001 ack 1 win 342 options [...] length 21224
...

This also contradicts the fact that we should bound to half of the
window if it is large.

Fix this flaw by correctly initializing max_window. Before that, it
could have large values that affect further calculations of 'size_goal'.
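
To make the half-window bound concrete, here is a simplified sketch loosely modeled on that rule. The function name bound_to_half_wnd and the MSS_DEFAULT constant are assumptions for illustration, and the sketch omits the minimum-size floor a real stack would apply.

```c
#include <stdint.h>

/* Assumed small-window threshold used only for this illustration. */
#define MSS_DEFAULT 536u

/*
 * Bound a packet size to half of the largest window the peer has
 * advertised. A max_window of 0 means "nothing known yet", so no bound
 * is applied; very small windows are used as the cutoff unchanged.
 */
static uint32_t bound_to_half_wnd(uint32_t max_window, uint32_t pktsize)
{
	uint32_t cutoff = max_window > MSS_DEFAULT ? max_window >> 1
						    : max_window;

	if (cutoff && pktsize > cutoff)
		return cutoff;
	return pktsize;
}
```

With the window from the trace above (342 << 7 = 43776), a 65495-byte MSS would be bounded to 21888 bytes. A stale or oversized max_window leaves the size at the full MSS, which is the behavior the commit message describes.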

Fixes: 168a8f58059a ("tcp: TCP Fast Open Server - main code path")
Signed-off-by: Alexey Kodanev
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Alexey Kodanev
2017-02-04 16:47:10 +0800
79453ab88 ipv6: addrconf: Avoid addrconf_disable_change() using RCU read-side lock

[ Upstream commit 03e4deff4987f79c34112c5ba4eb195d4f9382b0 ]

Just like commit 4acd4945cd1e ("ipv6: addrconf: Avoid calling
netdevice notifiers with RCU read-side lock"), it is unnecessary
to make addrconf_disable_change() use RCU iteration over the
netdev list, since it already holds the RTNL lock. Otherwise we may hit
an "Illegal context switch in RCU read-side critical section" warning.

Signed-off-by: Kefeng Wang
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Kefeng Wang
2017-02-04 16:47:10 +0800
e9db042dc lwtunnel: fix autoload of lwt modules

[ Upstream commit 9ed59592e3e379b2e9557dc1d9e9ec8fcbb33f16 ]

Trying to add an mpls encap route when the MPLS modules are not loaded
hangs. For example:

CONFIG_MPLS=y
CONFIG_NET_MPLS_GSO=m
CONFIG_MPLS_ROUTING=m
CONFIG_MPLS_IPTUNNEL=m

$ ip route add 10.10.10.10/32 encap mpls 100 via inet 10.100.1.2

The ip command hangs:
root 880 826 0 21:25 pts/0 00:00:00 ip route add 10.10.10.10/32 encap mpls 100 via inet 10.100.1.2

$ cat /proc/880/stack
[] call_usermodehelper_exec+0xd6/0x134
[] __request_module+0x27b/0x30a
[] lwtunnel_build_state+0xe4/0x178
[] fib_create_info+0x47f/0xdd4
[] fib_table_insert+0x90/0x41f
[] inet_rtm_newroute+0x4b/0x52
...

modprobe is trying to load rtnl-lwt-MPLS:

root 881 5 0 21:25 ? 00:00:00 /sbin/modprobe -q -- rtnl-lwt-MPLS

and it hangs after loading mpls_router:

$ cat /proc/881/stack
[] rtnl_lock+0x12/0x14
[] register_netdevice_notifier+0x16/0x179
[] mpls_init+0x25/0x1000 [mpls_router]
[] do_one_initcall+0x8e/0x13f
[] do_init_module+0x5a/0x1e5
[] load_module+0x13bd/0x17d6
...

The problem is that lwtunnel_build_state is called with the rtnl lock
held, preventing mpls_init from registering.

Given the potential references held by the time lwtunnel_build_state is
called, it cannot drop the rtnl lock to load the module. So, extract the module
loading code from lwtunnel_build_state into a new function to validate
the encap type. The new function is called while converting the user
request into a fib_config which is well before any table, device or
fib entries are examined.

Fixes: 745041e2aaf1 ("lwtunnel: autoload of lwt modules")
Signed-off-by: David Ahern
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

David Ahern
2017-02-04 16:47:10 +0800
948e137ad net: fix harmonize_features() vs NETIF_F_HIGHDMA

[ Upstream commit 7be2c82cfd5d28d7adb66821a992604eb6dd112e ]

Ashizuka reported a highmem oddity and sent a patch for the freescale
fec driver.

But the root cause is that the core networking stack
must ensure no skb with a highmem fragment is ever sent through
a device that does not assert NETIF_F_HIGHDMA in its features.

We need to call illegal_highdma() from harmonize_features()
regardless of CSUM checks.
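
The ordering issue can be sketched with a toy feature mask. Everything here is hypothetical (the F_* flags, the harmonize function and its boolean inputs stand in for the kernel's feature bits and skb inspection); it only shows the highmem check running independently of the checksum check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits, stand-ins for the kernel's NETIF_F_* flags. */
#define F_SG      (1u << 0)
#define F_CSUM    (1u << 1)
#define F_GSO     (1u << 2)
#define F_HIGHDMA (1u << 3)

/*
 * Reduce the feature set to what is safe for one packet.
 * needs_csum:       the packet relies on checksum offload
 * has_highmem_frag: the packet has a fragment in high memory
 */
static uint32_t harmonize(uint32_t features, bool needs_csum,
			  bool has_highmem_frag)
{
	/* Checksum-related reduction, only when the packet needs it. */
	if (needs_csum && !(features & F_CSUM))
		features &= ~F_GSO;

	/*
	 * Highmem reduction, applied whether or not the checksum branch
	 * above was taken: without F_HIGHDMA the device cannot take
	 * scatter-gather fragments that live in high memory.
	 */
	if (has_highmem_frag && !(features & F_HIGHDMA))
		features &= ~F_SG;

	return features;
}
```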

Fixes: ec5f06156423 ("net: Kill link between CSUM and SG features.")
Signed-off-by: Eric Dumazet
Cc: Pravin Shelar
Reported-by: "Ashizuka, Yuusuke"
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Eric Dumazet
2017-02-04 16:47:10 +0800
1e7cbb413 virtio-net: restore VIRTIO_HDR_F_DATA_VALID on receiving

[ Upstream commit 6391a4481ba0796805d6581e42f9f0418c099e34 ]

Commit 501db511397f ("virtio: don't set VIRTIO_NET_HDR_F_DATA_VALID on
xmit") in fact disables VIRTIO_HDR_F_DATA_VALID on the receiving path too.
Fix this by adding a hint (has_data_valid) and setting it only on the
receiving path.

Cc: Rolf Neugebauer
Signed-off-by: Jason Wang
Acked-by: Rolf Neugebauer
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Jason Wang
2017-02-04 16:47:09 +0800
b260a714a net sched actions: fix refcnt when GETing of action after bind

[ Upstream commit 0faa9cb5b3836a979864a6357e01d2046884ad52 ]

Demonstrating the issue:

.. add a drop action
$ sudo $TC actions add action drop index 10

.. retrieve it
$ sudo $TC -s actions get action gact index 10

action order 1: gact action drop
random type none pass val 0
index 10 ref 2 bind 0 installed 29 sec used 29 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

... bug 1 above: reference is two.
Reference is actually 1 but we forget to subtract 1.

... do a GET again and we see the same issue
try a few times and nothing changes
~$ sudo $TC -s actions get action gact index 10

action order 1: gact action drop
random type none pass val 0
index 10 ref 2 bind 0 installed 31 sec used 31 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

... let's try to bind the action to a filter..
$ sudo $TC qdisc add dev lo ingress
$ sudo $TC filter add dev lo parent ffff: protocol ip prio 1 \
u32 match ip dst 127.0.0.1/32 flowid 1:1 action gact index 10

... and now a few GETs:
$ sudo $TC -s actions get action gact index 10

action order 1: gact action drop
random type none pass val 0
index 10 ref 3 bind 1 installed 204 sec used 204 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

$ sudo $TC -s actions get action gact index 10

action order 1: gact action drop
random type none pass val 0
index 10 ref 4 bind 1 installed 206 sec used 206 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

$ sudo $TC -s actions get action gact index 10

action order 1: gact action drop
random type none pass val 0
index 10 ref 5 bind 1 installed 235 sec used 235 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

.... as can be observed the reference count keeps going up.

After the fix

$ sudo $TC actions add action drop index 10
$ sudo $TC -s actions get action gact index 10

action order 1: gact action drop
random type none pass val 0
index 10 ref 1 bind 0 installed 4 sec used 4 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

$ sudo $TC -s actions get action gact index 10

action order 1: gact action drop
random type none pass val 0
index 10 ref 1 bind 0 installed 6 sec used 6 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

$ sudo $TC qdisc add dev lo ingress
$ sudo $TC filter add dev lo parent ffff: protocol ip prio 1 \
u32 match ip dst 127.0.0.1/32 flowid 1:1 action gact index 10

$ sudo $TC -s actions get action gact index 10

action order 1: gact action drop
random type none pass val 0
index 10 ref 2 bind 1 installed 32 sec used 32 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

$ sudo $TC -s actions get action gact index 10

action order 1: gact action drop
random type none pass val 0
index 10 ref 2 bind 1 installed 33 sec used 33 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

Fixes: aecc5cefc389 ("net sched actions: fix GETing actions")
Signed-off-by: Jamal Hadi Salim
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Jamal Hadi Salim
2017-02-04 16:47:09 +0800
2d6b61ec9 ax25: Fix segfault after sock connection timeout

[ Upstream commit 8a367e74c0120ef68c8c70d5a025648c96626dff ]

The ax.25 socket connection timed out and the sock struct has been
previously taken down, i.e. the sock struct is now a NULL pointer. Checking
the sock_flag causes the segfault. Check if the socket struct pointer
is NULL before checking sock_flag. This segfault is seen in
timed out netrom connections.
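
The shape of such a guard can be shown in a few lines. This is only an illustrative sketch: struct toy_sock, FLAG_DESTROY and should_disconnect are hypothetical stand-ins, not the ax25 code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket with a flags word. */
struct toy_sock {
	unsigned long flags;
};

#define FLAG_DESTROY (1ul << 0)

/*
 * Decide whether a timed-out connection should be disconnected.
 * The NULL test must come first: || short-circuits, so the flags word
 * is never read through a NULL pointer.
 */
static bool should_disconnect(const struct toy_sock *sk)
{
	return sk == NULL || !(sk->flags & FLAG_DESTROY);
}
```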

Please submit to -stable.

Signed-off-by: Basil Gunn
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Basil Gunn
2017-02-04 16:47:09 +0800
c7a5df92d ip6_tunnel: Account for tunnel header in tunnel MTU

[ Upstream commit 02ca0423fd65a0a9c4d70da0dbb8f4b8503f08c7 ]

With ip6gre we have a tunnel header which also makes the tunnel MTU
smaller. We need to reserve room for it. Previously we were using up
space reserved for the Tunnel Encapsulation Limit option
header (RFC 2473).

Also, after commit b05229f44228 ("gre6: Cleanup GREv6 transmit path,
call common GRE functions") our contract with the caller has
changed. Now we check if the packet length exceeds the tunnel MTU after
the tunnel header has been pushed, unlike before.

This is reflected in the check where we look at the packet length minus
the size of the tunnel header, which is already accounted for in tunnel
MTU.
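
The arithmetic can be sketched as follows. The function names are hypothetical and the computation is simplified; the constants are the fixed IPv6 header size (40 bytes), the 8-byte destination option that carries the Tunnel Encapsulation Limit, and the IPv6 minimum MTU (1280).

```c
#include <stdbool.h>

#define IPV6_HDR_LEN    40	/* fixed IPv6 header */
#define ENCAP_LIMIT_LEN 8	/* Tunnel Encapsulation Limit option */
#define IPV6_MIN_MTU    1280

/*
 * Tunnel MTU: what is left of the link MTU after the outer IPv6 header,
 * the tunnel header (for example a GRE header) and, if used, the
 * encapsulation limit option. Never below the IPv6 minimum MTU.
 */
static int tunnel_mtu(int link_mtu, int tun_hlen, bool encap_limit)
{
	int mtu = link_mtu - IPV6_HDR_LEN - tun_hlen;

	if (encap_limit)
		mtu -= ENCAP_LIMIT_LEN;
	return mtu < IPV6_MIN_MTU ? IPV6_MIN_MTU : mtu;
}

/*
 * Size check done after the tunnel header has already been pushed:
 * the header is subtracted from the packet length because the tunnel
 * MTU already accounts for it.
 */
static bool exceeds_tunnel_mtu(int pkt_len, int tun_hlen, int mtu)
{
	return pkt_len - tun_hlen > mtu;
}
```

For a 1500-byte link and a 4-byte tunnel header this gives 1456 bytes without the encapsulation limit option and 1448 with it.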

Fixes: b05229f44228 ("gre6: Cleanup GREv6 transmit path, call common GRE functions")
Signed-off-by: Jakub Sitnicki
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Jakub Sitnicki
2017-02-04 16:47:09 +0800
18767acb7 openvswitch: maintain correct checksum state in conntrack actions

[ Upstream commit 75f01a4c9cc291ff5cb28ca1216adb163b7a20ee ]

When executing conntrack actions on skbuffs with checksum mode
CHECKSUM_COMPLETE, the checksum must be updated to account for
header pushes and pulls. Otherwise we get "hw csum failure"
logs similar to this (ICMP packet received on geneve tunnel
via ixgbe NIC):

[ 405.740065] genev_sys_6081: hw csum failure
[ 405.740106] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G I 4.10.0-rc3+ #1
[ 405.740108] Call Trace:
[ 405.740110]
[ 405.740113] dump_stack+0x63/0x87
[ 405.740116] netdev_rx_csum_fault+0x3a/0x40
[ 405.740118] __skb_checksum_complete+0xcf/0xe0
[ 405.740120] nf_ip_checksum+0xc8/0xf0
[ 405.740124] icmp_error+0x1de/0x351 [nf_conntrack_ipv4]
[ 405.740132] nf_conntrack_in+0xe1/0x550 [nf_conntrack]
[ 405.740137] ? find_bucket.isra.2+0x62/0x70 [openvswitch]
[ 405.740143] __ovs_ct_lookup+0x95/0x980 [openvswitch]
[ 405.740145] ? netif_rx_internal+0x44/0x110
[ 405.740149] ovs_ct_execute+0x147/0x4b0 [openvswitch]
[ 405.740153] do_execute_actions+0x22e/0xa70 [openvswitch]
[ 405.740157] ovs_execute_actions+0x40/0x120 [openvswitch]
[ 405.740161] ovs_dp_process_packet+0x84/0x120 [openvswitch]
[ 405.740166] ovs_vport_receive+0x73/0xd0 [openvswitch]
[ 405.740168] ? udp_rcv+0x1a/0x20
[ 405.740170] ? ip_local_deliver_finish+0x93/0x1e0
[ 405.740172] ? ip_local_deliver+0x6f/0xe0
[ 405.740174] ? ip_rcv_finish+0x3a0/0x3a0
[ 405.740176] ? ip_rcv_finish+0xdb/0x3a0
[ 405.740177] ? ip_rcv+0x2a7/0x400
[ 405.740180] ? __netif_receive_skb_core+0x970/0xa00
[ 405.740185] netdev_frame_hook+0xd3/0x160 [openvswitch]
[ 405.740187] __netif_receive_skb_core+0x1dc/0xa00
[ 405.740194] ? ixgbe_clean_rx_irq+0x46d/0xa20 [ixgbe]
[ 405.740197] __netif_receive_skb+0x18/0x60
[ 405.740199] netif_receive_skb_internal+0x40/0xb0
[ 405.740201] napi_gro_receive+0xcd/0x120
[ 405.740204] gro_cell_poll+0x57/0x80 [geneve]
[ 405.740206] net_rx_action+0x260/0x3c0
[ 405.740209] __do_softirq+0xc9/0x28c
[ 405.740211] irq_exit+0xd9/0xf0
[ 405.740213] do_IRQ+0x51/0xd0
[ 405.740215] common_interrupt+0x93/0x93

Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
Signed-off-by: Lance Richardson
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Lance Richardson
2017-02-04 16:47:09 +0800
3524f6422 tcp: fix tcp_fastopen unaligned access complaints on sparc

[ Upstream commit 003c941057eaa868ca6fedd29a274c863167230d ]

Fix up a data alignment issue on sparc by swapping the order
of the cookie byte array field with the length field in
struct tcp_fastopen_cookie, and making it a proper union
to clean up the typecasting.

This addresses log complaints like these:
log_unaligned: 113 callbacks suppressed
Kernel unaligned access at TPC[976490] tcp_try_fastopen+0x2d0/0x360
Kernel unaligned access at TPC[9764ac] tcp_try_fastopen+0x2ec/0x360
Kernel unaligned access at TPC[9764c8] tcp_try_fastopen+0x308/0x360
Kernel unaligned access at TPC[9764e4] tcp_try_fastopen+0x324/0x360
Kernel unaligned access at TPC[976490] tcp_try_fastopen+0x2d0/0x360

Cc: Eric Dumazet
Signed-off-by: Shannon Nelson
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Shannon Nelson
2017-02-04 16:47:09 +0800
958bb1bdc net: ipv4: fix table id in getroute response

[ Upstream commit 8a430ed50bb1b19ca14a46661f3b1b35f2fb5c39 ]

rtm_table is an 8-bit field while table ids are allowed up to u32. Commit
709772e6e065 ("net: Fix routing tables with id > 255 for legacy software")
added the preference to set rtm_table in dumps to RT_TABLE_COMPAT if the
table id is > 255. The table id returned on get route requests should do
the same.
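
The mapping from a 32-bit table id to the 8-bit rtm_table field is a one-liner. In this sketch the function name is hypothetical and TABLE_COMPAT mirrors the value of RT_TABLE_COMPAT (252) from the rtnetlink UAPI header.

```c
#include <stdint.h>

/* Mirrors RT_TABLE_COMPAT from the rtnetlink UAPI header. */
#define TABLE_COMPAT 252

/*
 * Value for the 8-bit rtm_table field: ids that fit are reported as-is,
 * larger ids are reported as the compat marker (the full id is carried
 * separately in a 32-bit attribute).
 */
static uint8_t rtm_table_for(uint32_t table_id)
{
	return table_id < 256 ? (uint8_t)table_id : TABLE_COMPAT;
}
```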

Fixes: c36ba6603a11 ("net: Allow user to get table id from route lookup")
Signed-off-by: David Ahern
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

David Ahern
2017-02-04 16:47:08 +0800
6980c52c4 net: lwtunnel: Handle lwtunnel_fill_encap failure

[ Upstream commit ea7a80858f57d8878b1499ea0f1b8a635cc48de7 ]

Handle failure in lwtunnel_fill_encap adding attributes to skb.

Fixes: 571e722676fe ("ipv4: support for fib route lwtunnel encap attributes")
Fixes: 19e42e451506 ("ipv6: support for fib route lwtunnel encap attributes")
Signed-off-by: David Ahern
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

David Ahern
2017-02-04 16:47:08 +0800

01 Feb, 2017

1 commit

cb1d48f55 SUNRPC: cleanup ida information when removing sunrpc module

commit c929ea0b910355e1876c64431f3d5802f95b3d75 upstream.

After removing the sunrpc module, I get many kmemleak reports such as:
unreferenced object 0xffff88003316b1e0 (size 544):
comm "gssproxy", pid 2148, jiffies 4294794465 (age 4200.081s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[] kmemleak_alloc+0x4a/0xa0
[] kmem_cache_alloc+0x15e/0x1f0
[] ida_pre_get+0xaa/0x150
[] ida_simple_get+0xad/0x180
[] nlmsvc_lookup_host+0x4ab/0x7f0 [lockd]
[] lockd+0x4d/0x270 [lockd]
[] param_set_timeout+0x55/0x100 [lockd]
[] svc_defer+0x114/0x3f0 [sunrpc]
[] svc_defer+0x2d7/0x3f0 [sunrpc]
[] rpc_show_info+0x8a/0x110 [sunrpc]
[] proc_reg_write+0x7f/0xc0
[] __vfs_write+0xdf/0x3c0
[] vfs_write+0xef/0x240
[] SyS_write+0xad/0x130
[] entry_SYSCALL_64_fastpath+0x1a/0xa9
[] 0xffffffffffffffff

I found that the ida information (dynamic memory) isn't cleaned up.

Signed-off-by: Kinglong Mee
Fixes: 2f048db4680a ("SUNRPC: Add an identifier for struct rpc_clnt")
Signed-off-by: Trond Myklebust
Signed-off-by: Greg Kroah-Hartman

Kinglong Mee
2017-02-01 15:33:09 +0800

26 Jan, 2017

18 commits

f77ef5348 libceph: stop allocating a new cipher on every crypto request

commit 7af3ea189a9a13f090de51c97f676215dabc1205 upstream.

This is useless and more importantly not allowed on the writeback path,
because crypto_alloc_skcipher() allocates memory with GFP_KERNEL, which
can recurse back into the filesystem:

kworker/9:3 D ffff92303f318180 0 20732 2 0x00000080
Workqueue: ceph-msgr ceph_con_workfn [libceph]
ffff923035dd4480 ffff923038f8a0c0 0000000000000001 000000009eb27318
ffff92269eb28000 ffff92269eb27338 ffff923036b145ac ffff923035dd4480
00000000ffffffff ffff923036b145b0 ffffffff951eb4e1 ffff923036b145a8
Call Trace:
[] ? schedule+0x31/0x80
[] ? schedule_preempt_disabled+0xa/0x10
[] ? __mutex_lock_slowpath+0xb4/0x130
[] ? mutex_lock+0x1b/0x30
[] ? xfs_reclaim_inodes_ag+0x233/0x2d0 [xfs]
[] ? move_active_pages_to_lru+0x125/0x270
[] ? radix_tree_gang_lookup_tag+0xc5/0x1c0
[] ? __list_lru_walk_one.isra.3+0x33/0x120
[] ? xfs_reclaim_inodes_nr+0x31/0x40 [xfs]
[] ? super_cache_scan+0x17e/0x190
[] ? shrink_slab.part.38+0x1e3/0x3d0
[] ? shrink_node+0x10a/0x320
[] ? do_try_to_free_pages+0xf4/0x350
[] ? try_to_free_pages+0xea/0x1b0
[] ? __alloc_pages_nodemask+0x61d/0xe60
[] ? cache_grow_begin+0x9d/0x560
[] ? fallback_alloc+0x148/0x1c0
[] ? __crypto_alloc_tfm+0x37/0x130
[] ? __kmalloc+0x1eb/0x580
[] ? crush_choose_firstn+0x3eb/0x470 [libceph]
[] ? __crypto_alloc_tfm+0x37/0x130
[] ? crypto_spawn_tfm+0x39/0x60
[] ? crypto_cbc_init_tfm+0x23/0x40 [cbc]
[] ? __crypto_alloc_tfm+0xcc/0x130
[] ? crypto_skcipher_init_tfm+0x113/0x180
[] ? crypto_create_tfm+0x43/0xb0
[] ? crypto_larval_lookup+0x150/0x150
[] ? crypto_alloc_tfm+0x72/0x120
[] ? ceph_aes_encrypt2+0x67/0x400 [libceph]
[] ? ceph_pg_to_up_acting_osds+0x84/0x5b0 [libceph]
[] ? release_sock+0x40/0x90
[] ? tcp_recvmsg+0x4b4/0xae0
[] ? ceph_encrypt2+0x54/0xc0 [libceph]
[] ? ceph_x_encrypt+0x5d/0x90 [libceph]
[] ? calcu_signature+0x5f/0x90 [libceph]
[] ? ceph_x_sign_message+0x35/0x50 [libceph]
[] ? prepare_write_message_footer+0x5c/0xa0 [libceph]
[] ? ceph_con_workfn+0x2258/0x2dd0 [libceph]
[] ? queue_con_delay+0x33/0xd0 [libceph]
[] ? __submit_request+0x20d/0x2f0 [libceph]
[] ? ceph_osdc_start_request+0x28/0x30 [libceph]
[] ? rbd_queue_workfn+0x2f3/0x350 [rbd]
[] ? process_one_work+0x160/0x410
[] ? worker_thread+0x4d/0x480
[] ? process_one_work+0x410/0x410
[] ? kthread+0xcd/0xf0
[] ? ret_from_fork+0x1f/0x40
[] ? kthread_create_on_node+0x190/0x190

Allocating the cipher along with the key fixes the issue - as long as the
key doesn't change, a single cipher context can be used concurrently in
multiple requests.

We still can't take that GFP_KERNEL allocation though. Both
ceph_crypto_key_clone() and ceph_crypto_key_decode() are called from
GFP_NOFS context, so resort to memalloc_noio_{save,restore}() here.

Reported-by: Lucas Stach
Signed-off-by: Ilya Dryomov
Reviewed-by: Sage Weil
Signed-off-by: Greg Kroah-Hartman

Ilya Dryomov
2017-01-26 15:24:46 +0800
5b482bf58 libceph: uninline ceph_crypto_key_destroy()

commit 6db2304aabb070261ad34923bfd83c43dfb000e3 upstream.

Signed-off-by: Ilya Dryomov
Reviewed-by: Sage Weil
Signed-off-by: Greg Kroah-Hartman

Ilya Dryomov
2017-01-26 15:24:46 +0800
d34b6684e xprtrdma: Squelch "max send, max recv" messages at connect time

commit 6d6bf72de914059b304f7b99530a7856e5c846aa upstream.

Clean up: This message was intended to be a dprintk, as it is on the
server-side.

Fixes: 87cfb9a0c85c ('xprtrdma: Client-side support for ...')
Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker
Signed-off-by: Greg Kroah-Hartman

Chuck Lever
2017-01-26 15:24:43 +0800
8ade1c2b4 xprtrdma: Make FRWR send queue entry accounting more accurate

commit 8d38de65644d900199f035277aa5f3da4aa9fc17 upstream.

Verbs providers may perform house-keeping on the Send Queue during
each signaled send completion. It is necessary therefore for a verbs
consumer (like xprtrdma) to occasionally force a signaled send
completion if it runs unsignaled most of the time.

xprtrdma does not require signaled completions for Send or FastReg
Work Requests, but does signal some LocalInv Work Requests. To
ensure that Send Queue house-keeping can run before the Send Queue
is more than half-consumed, xprtrdma forces a signaled completion
on occasion by counting the number of Send Queue Entries it
consumes. It currently does this by counting each ib_post_send as
one Entry.

Commit c9918ff56dfb ("xprtrdma: Add ro_unmap_sync method for FRWR")
introduced the ability for frwr_op_unmap_sync to post more than one
Work Request with a single post_send. Thus the underlying assumption
of one Send Queue Entry per ib_post_send is no longer true.

Also, FastReg Work Requests are currently never signaled. They
should be signaled once in a while, just as Send is, to keep the
accounting of consumed SQEs accurate.

While we're here, convert the CQCOUNT macros to the currently
preferred kernel coding style, which is inline functions.
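
A countdown of consumed Send Queue Entries can be sketched as below. This is a single-threaded illustration with hypothetical names (struct sq_counter, sq_init, sq_consume); a real transport would need an atomic counter.

```c
#include <stdbool.h>

/* Hypothetical per-endpoint bookkeeping. */
struct sq_counter {
	int cqinit;	/* entries allowed between signaled completions */
	int cqcount;	/* entries left before the next signaled one */
};

/* Restart the countdown, pre-charging it with entries already used. */
static void sq_init(struct sq_counter *c, int used)
{
	c->cqcount = c->cqinit - used;
}

/*
 * Account for one post that consumes nr_wrs Send Queue Entries.
 * Returns true when this post should request a signaled completion,
 * and restarts the countdown in that case.
 */
static bool sq_consume(struct sq_counter *c, int nr_wrs)
{
	c->cqcount -= nr_wrs;
	if (c->cqcount > 0)
		return false;

	sq_init(c, 0);
	return true;
}
```

Passing the real number of Work Requests per post, instead of assuming one, is what keeps the countdown honest when a single post chains several requests.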

Fixes: c9918ff56dfb ("xprtrdma: Add ro_unmap_sync method for FRWR")
Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker
Signed-off-by: Greg Kroah-Hartman

Chuck Lever
2017-01-26 15:24:43 +0800
a193c7247 libceph: make sure ceph_aes_crypt() IV is aligned

commit 124f930b8cbc4ac11236e6eb1c5f008318864588 upstream.

... otherwise the crypto stack will align it for us with a GFP_ATOMIC
allocation and a memcpy() -- see skcipher_walk_first().

Signed-off-by: Ilya Dryomov
Signed-off-by: Greg Kroah-Hartman

Ilya Dryomov
2017-01-26 15:24:43 +0800
b8add6715 libceph: remove now unused ceph_*{en,de}crypt*() functions

commit 2b1e1a7cd0a615d57455567a549f9965023321b5 upstream.

Signed-off-by: Ilya Dryomov
Reviewed-by: Sage Weil
Signed-off-by: Greg Kroah-Hartman

Ilya Dryomov
2017-01-26 15:24:42 +0800
2982b9c92 libceph: switch ceph_x_decrypt() to ceph_crypt()

commit e15fd0a11db00fc7f470a9fc804657ec3f6d04a5 upstream.

Signed-off-by: Ilya Dryomov
Reviewed-by: Sage Weil
Signed-off-by: Greg Kroah-Hartman

Ilya Dryomov
2017-01-26 15:24:42 +0800
717a145bd libceph: switch ceph_x_encrypt() to ceph_crypt()

commit d03857c63bb036edff0aa7a107276360173aca4e upstream.

Signed-off-by: Ilya Dryomov
Reviewed-by: Sage Weil
Signed-off-by: Greg Kroah-Hartman

Ilya Dryomov
2017-01-26 15:24:41 +0800
6e371f9a4 libceph: tweak calcu_signature() a little

commit 4eb4517ce7c9c573b6c823de403aeccb40018cfc upstream.

- replace an ad-hoc array with a struct
- rename to calc_signature() for consistency

Signed-off-by: Ilya Dryomov
Reviewed-by: Sage Weil
Signed-off-by: Greg Kroah-Hartman

Ilya Dryomov
2017-01-26 15:24:41 +0800
788a0bbc7 libceph: rename and align ceph_x_authorizer::reply_buf

commit 7882a26d2e2e520099e2961d5e2e870f8e4172dc upstream.

It's going to be used as a temporary buffer for in-place en/decryption
with ceph_crypt() instead of on-stack buffers, so rename to enc_buf.
Ensure alignment to avoid GFP_ATOMIC allocations in the crypto stack.

Signed-off-by: Ilya Dryomov
Reviewed-by: Sage Weil
Signed-off-by: Greg Kroah-Hartman

Ilya Dryomov
2017-01-26 15:24:41 +0800
ecf7ced85 libceph: introduce ceph_crypt() for in-place en/decryption

commit a45f795c65b479b4ba107b6ccde29b896d51ee98 upstream.

Starting with 4.9, kernel stacks may be vmalloced and therefore not
guaranteed to be physically contiguous; the new CONFIG_VMAP_STACK
option is enabled by default on x86. This makes it invalid to use
on-stack buffers with the crypto scatterlist API, as sg_set_buf()
expects a logical address and won't work with vmalloced addresses.

There isn't a different (e.g. kvec-based) crypto API we could switch
net/ceph/crypto.c to and the current scatterlist.h API isn't getting
updated to accommodate this use case. Allocating a new header and
padding for each operation is a non-starter, so do the en/decryption
in-place on a single pre-assembled (header + data + padding) heap
buffer. This is explicitly supported by the crypto API:

"... the caller may provide the same scatter/gather list for the
plaintext and cipher text. After the completion of the cipher
operation, the plaintext data is replaced with the ciphertext data
in case of an encryption and vice versa for a decryption."

Signed-off-by: Ilya Dryomov
Reviewed-by: Sage Weil
Signed-off-by: Greg Kroah-Hartman

Ilya Dryomov
2017-01-26 15:24:41 +0800
0548b8298 libceph: introduce ceph_x_encrypt_offset()

commit 55d9cc834f933698fc864f0d36f3cca533d30a8d upstream.

Signed-off-by: Ilya Dryomov
Reviewed-by: Sage Weil
Signed-off-by: Greg Kroah-Hartman

Ilya Dryomov
2017-01-26 15:24:41 +0800
be6045761 libceph: old_key in process_one_ticket() is redundant

commit 462e650451c577d15eeb4d883d70fa9e4e529fad upstream.

Since commit 0a990e709356 ("ceph: clean up service ticket decoding"),
th->session_key isn't assigned until everything is decoded.

Signed-off-by: Ilya Dryomov
Reviewed-by: Sage Weil
Signed-off-by: Greg Kroah-Hartman

Ilya Dryomov
2017-01-26 15:24:41 +0800
2e62bf3c6 libceph: ceph_x_encrypt_buflen() takes in_len

commit 36721ece1e84a25130c4befb930509b3f96de020 upstream.

Pass what's going to be encrypted - that's msg_b, not ticket_blob.
ceph_x_encrypt_buflen() returns the upper bound, so this doesn't change
the maxlen calculation, but makes it a bit clearer.

Signed-off-by: Ilya Dryomov
Reviewed-by: Sage Weil
Signed-off-by: Greg Kroah-Hartman

Ilya Dryomov
2017-01-26 15:24:41 +0800
73a2e2405 svcrdma: avoid duplicate dma unmapping during error recovery

commit ce1ca7d2d140a1f4aaffd297ac487f246963dd2f upstream.

In rdma_read_chunk_frmr() when ib_post_send() fails, the error code path
invokes ib_dma_unmap_sg() to unmap the sg list. It then invokes
svc_rdma_put_frmr() which in turn tries to unmap the same sg list through
ib_dma_unmap_sg() again. This second unmap is invalid and could lead to
problems when the iova being unmapped is subsequently reused. Remove
the call to unmap in rdma_read_chunk_frmr() and let svc_rdma_put_frmr()
handle it.

Fixes: 412a15c0fe53 ("svcrdma: Port to new memory registration API")
Signed-off-by: Sriharsha Basavapatna
Reviewed-by: Chuck Lever
Reviewed-by: Yuval Shaia
Signed-off-by: J. Bruce Fields
Signed-off-by: Greg Kroah-Hartman

Sriharsha Basavapatna
2017-01-26 15:24:40 +0800
bab10a549 mac80211: implement multicast forwarding on fast-RX path

commit eeb0d56fab4cd7848cf2be6704fa48900dbc1381 upstream.

In AP (or VLAN) mode, when unicast 802.11 packets are received,
they might actually be multicast after conversion. In this case
the fast-RX path didn't handle them properly to send them back
to the wireless medium. Implement that by copying the SKB and
sending it back out.

The possible alternative would be to just punt the packet back
to the regular (slow) RX path, but since we have almost all of
the required code here already it's not so complicated to add
here. Punting it back would also mean acquiring the spinlock,
which would be bad for the stated purpose of the fast-RX path,
to enable well-performing parallel RX.

Signed-off-by: Johannes Berg
Signed-off-by: Greg Kroah-Hartman

Johannes Berg
2017-01-26 15:24:39 +0800
f29f3616b svcrpc: don't leak contexts on PROC_DESTROY

commit 78794d1890708cf94e3961261e52dcec2cc34722 upstream.

Context expiry times are in units of seconds since boot, not unix time.

The use of get_seconds() here therefore sets the expiry time decades in
the future. This prevents timely freeing of contexts destroyed by
client RPC_GSS_PROC_DESTROY requests. We'd still free them eventually
(when the module is unloaded or the container shut down), but a lot of
contexts could pile up before then.
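
The time-base mismatch is easy to see with numbers. The sketch below uses a hypothetical is_expired helper and made-up clock readings; it only illustrates comparing an expiry stamp against a seconds-since-boot clock.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * An entry is expired once its expiry stamp lies in the past of the
 * clock it is compared against. Here that clock counts seconds since
 * boot, so the stamp must be in the same units.
 */
static bool is_expired(int64_t expiry_time, int64_t now_since_boot)
{
	return expiry_time < now_since_boot;
}
```

Suppose the machine has been up for 4200 seconds when a context is destroyed. Stamping it with 4200 (boot time base) makes it expire one second later. Stamping it with a unix time such as 1485415477 keeps it alive until the uptime reaches that value, roughly 47 years.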

Fixes: c5b29f885afe "sunrpc: use seconds since boot in expiry cache"
Reported-by: Andy Adamson
Signed-off-by: J. Bruce Fields
Signed-off-by: Greg Kroah-Hartman

J. Bruce Fields
2017-01-26 15:24:37 +0800
a297ed84b sunrpc: don't call sleeping functions from the notifier block callbacks

commit 546125d1614264d26080817d0c8cddb9b25081fa upstream.

The inet6addr_chain is an atomic notifier chain, so we can't call
anything that might sleep (like lock_sock)... instead of closing the
socket from svc_age_temp_xprts_now (which is called by the notifier
function), just have the rpc service threads do it instead.

Fixes: c3d4879e01be "sunrpc: Add a function to close..."
Signed-off-by: Scott Mayhew
Signed-off-by: J. Bruce Fields
Signed-off-by: Greg Kroah-Hartman

Scott Mayhew
2017-01-26 15:24:37 +0800

20 Jan, 2017

3 commits

9297e0c18 net/af_iucv: don't use paged skbs for TX on HiperSockets

commit dc5367bcc556e97555fc94a32cd1aadbebdff47e upstream.

With commit e53743994e21
("af_iucv: use paged SKBs for big outbound messages"),
we transmit paged skbs for both of AF_IUCV's transport modes
(IUCV or HiperSockets).
The qeth driver for Layer 3 HiperSockets currently doesn't
support NETIF_F_SG, so these skbs would just be linearized again
by the stack.
Avoid that overhead by using paged skbs only for IUCV transport.

cc stable, since this also circumvents a significant skb leak when
sending large messages (where the skb then needs to be linearized).

Signed-off-by: Julian Wiedmann
Signed-off-by: Ursula Braun
Fixes: e53743994e21 ("af_iucv: use paged SKBs for big outbound messages")
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Julian Wiedmann
2017-01-20 03:18:04 +0800
259495a04 bridge: netfilter: Fix dropping packets that moving through bridge interface

commit 14221cc45caad2fcab3a8543234bb7eda9b540d5 upstream.

Problem:
br_nf_pre_routing_finish() calls itself instead of
br_nf_pre_routing_finish_bridge(). Due to this bug the reverse path filter
drops packets that go through the bridge interface.

User impact:
Local docker containers with a bridge network cannot communicate with each
other.

Fixes: c5136b15ea36 ("netfilter: bridge: add and use br_nf_hook_thresh")
Signed-off-by: Artur Molchanov
Acked-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
Signed-off-by: Greg Kroah-Hartman

Artur Molchanov
2017-01-20 03:18:01 +0800
0a28f5393 nl80211: fix sched scan netlink socket owner destruction

commit 753aacfd2e95df6a0caf23c03dc309020765bea9 upstream.

A single netlink socket might own multiple interfaces *and* a
scheduled scan request (which might belong to another interface),
so when it goes away both may need to be destroyed.

Remove the schedule_scan_stop indirection to fix this - it's only
needed for interface destruction because of the way this works
right now, with a single work taking care of all interfaces.

Fixes: 93a1e86ce10e4 ("nl80211: Stop scheduled scan if netlink client disappears")
Signed-off-by: Johannes Berg
Signed-off-by: Greg Kroah-Hartman

Johannes Berg
2017-01-20 03:18:00 +0800

15 Jan, 2017

4 commits

bd99e7a60 svcrdma: Clear xpt_bc_xps in xprt_setup_rdma_bc() error exit arm

commit 1b9f700b8cfc31089e2dfa5d0905c52fd4529b50 upstream.

Logic copied from xs_setup_bc_tcp().

Fixes: 39a9beab5acb ('rpc: share one xps between all backchannels')
Signed-off-by: Chuck Lever
Signed-off-by: J. Bruce Fields
Signed-off-by: Greg Kroah-Hartman

Chuck Lever
2017-01-15 20:42:56 +0800
7b7a5a85b net: ipv4: Fix multipath selection with vrf

[ Upstream commit 7a18c5b9fb31a999afc62b0e60978aa896fc89e9 ]

fib_select_path does not call fib_select_multipath if oif is set in the
flow struct. For VRF use cases oif is always set, so multipath route
selection is bypassed. Use the FLOWI_FLAG_SKIP_NH_OIF to skip the oif
check similar to what is done in fib_table_lookup.

Add saddr and proto to the flow struct for the fib lookup done by the
VRF driver to better match hash computation for a flow.

Fixes: 613d09b30f8b ("net: Use VRF device index for lookups on TX")
Signed-off-by: David Ahern
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

David Ahern
2017-01-15 20:42:55 +0800
17a561b19 gro: Disable frag0 optimization on IPv6 ext headers

[ Upstream commit 57ea52a865144aedbcd619ee0081155e658b6f7d ]

The GRO fast path caches the frag0 address. This address becomes
invalid if frag0 is modified by pskb_may_pull or its variants.
So whenever that happens we must disable the frag0 optimization.

This is usually done through the combination of gro_header_hard
and gro_header_slow, however, the IPv6 extension header path did
the pulling directly and would continue to use the GRO fast path
incorrectly.

This patch fixes it by disabling the fast path when we enter the
IPv6 extension header path.

Fixes: 78a478d0efd9 ("gro: Inline skb_gro_header and cache frag0 virtual address")
Reported-by: Slava Shwartsman
Signed-off-by: Herbert Xu
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Herbert Xu
2017-01-15 20:42:55 +0800
934ca017c gro: use min_t() in skb_gro_reset_offset()

[ Upstream commit 7cfd5fd5a9813f1430290d20c0fead9b4582a307 ]

On 32bit arches, (skb->end - skb->data) is not 'unsigned int',
so we shall use min_t() instead of min() to avoid a compiler error.
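
A minimal userspace illustration of the idea: cast both operands to one type before comparing, so a pointer difference (a signed type whose width depends on the architecture) can be compared with an unsigned int. The macro below is a simplified assumption, not the kernel's definition (it evaluates its arguments more than once), and frag0_len is a hypothetical helper.

```c
#include <stddef.h>

/* Simplified stand-in: compare x and y after casting both to type.
 * Arguments are evaluated more than once, so pass side-effect-free
 * expressions only. */
#define min_t(type, x, y) \
	((type)(x) < (type)(y) ? (type)(x) : (type)(y))

/*
 * Clamp a fragment size to the room left between two buffer pointers.
 * end - data has type ptrdiff_t, which differs from unsigned int.
 */
static unsigned int frag0_len(unsigned int frag_size,
			      const unsigned char *data,
			      const unsigned char *end)
{
	return min_t(unsigned int, frag_size, end - data);
}
```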

Fixes: 1272ce87fa01 ("gro: Enter slow-path if there is no tailroom")
Reported-by: kernel test robot
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Eric Dumazet
2017-01-15 20:42:55 +0800