Eric Lee / smarc-fsl-linux-kernel

30 Sep, 2016

1 commit

07613873f proc: Reduce cache miss in xfrm_statistics_seq_show

This is to use the generic interfaces snmp_get_cpu_field{,64}_batch to
aggregate the data by going through all the items of each cpu sequentially.

Signed-off-by: Jia He
Signed-off-by: David S. Miller

Jia He
2016-09-30 13:50:45 +0800
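
Note on 07613873f: the access-order change described in this commit can be pictured with a small userspace sketch. It is not the kernel code and does not use snmp_get_cpu_field{,64}_batch; the array layout, the DEMO_* sizes and the demo_* function names are assumptions made for illustration only. Both functions compute the same totals, but the batch variant reads each CPU's block of counters sequentially instead of jumping between CPUs for every item.

```c
#include <assert.h>
#include <string.h>

#define DEMO_NR_CPUS  4   /* hypothetical CPU count */
#define DEMO_NR_ITEMS 8   /* hypothetical number of MIB counters */

/* Item-major order: every counter visits each CPU's block in turn. */
void demo_sum_item_major(unsigned long stats[DEMO_NR_CPUS][DEMO_NR_ITEMS],
                         unsigned long out[DEMO_NR_ITEMS])
{
    for (int i = 0; i < DEMO_NR_ITEMS; i++) {
        out[i] = 0;
        for (int c = 0; c < DEMO_NR_CPUS; c++)
            out[i] += stats[c][i];
    }
}

/* Batch order: walk one CPU's counters sequentially, then the next CPU. */
void demo_sum_batch(unsigned long stats[DEMO_NR_CPUS][DEMO_NR_ITEMS],
                    unsigned long out[DEMO_NR_ITEMS])
{
    memset(out, 0, DEMO_NR_ITEMS * sizeof(out[0]));
    for (int c = 0; c < DEMO_NR_CPUS; c++)
        for (int i = 0; i < DEMO_NR_ITEMS; i++)
            out[i] += stats[c][i];
}

/* Self-check: both traversal orders must produce identical totals. */
int demo_sums_match(void)
{
    unsigned long stats[DEMO_NR_CPUS][DEMO_NR_ITEMS];
    unsigned long a[DEMO_NR_ITEMS], b[DEMO_NR_ITEMS];

    for (int c = 0; c < DEMO_NR_CPUS; c++)
        for (int i = 0; i < DEMO_NR_ITEMS; i++)
            stats[c][i] = (unsigned long)(c * 10 + i);

    demo_sum_item_major(stats, a);
    demo_sum_batch(stats, b);
    return memcmp(a, b, sizeof(a)) == 0 && b[1] == 64;
}
```

In the kernel the per-CPU blocks live in separate memory regions, so the sequential walk is where the cache benefit would come from; the flat array above only mirrors the loop order.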

24 Sep, 2016

1 commit

1678c1134 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next

Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2016-09-23

Only two patches this time:

1) Fix a comment reference to struct xfrm_replay_state_esn.
From Richard Guy Briggs.

2) Convert xfrm_state_lookup to rcu, we don't need the
xfrm_state_lock anymore in the input path.
From Florian Westphal.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller

David S. Miller
2016-09-24 20:18:19 +0800

23 Sep, 2016

1 commit

d6989d4bb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

David S. Miller
2016-09-23 18:46:57 +0800

21 Sep, 2016

1 commit

c2f672fc9 xfrm: state lookup can be lockless

This is called from the packet input path, so we get lock contention
if many cpus handle ipsec in parallel.

After recent rcu conversion it is safe to call __xfrm_state_lookup
without the spinlock.

Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert

Florian Westphal
2016-09-21 18:37:29 +0800

19 Sep, 2016

1 commit

b58847935 xfrm: Fix memory leak of aead algorithm name

commit 1a6509d99122 ("[IPSEC]: Add support for combined mode algorithms")
introduced aead. The function attach_aead kmemdup()s the algorithm
name during xfrm_state_construct().
However this memory is never freed.
Implementation has since been slightly modified in
commit ee5c23176fcc ("xfrm: Clone states properly on migration")
without resolving this leak.
This patch adds a kfree() call for the aead algorithm name.

Fixes: 1a6509d99122 ("[IPSEC]: Add support for combined mode algorithms")
Signed-off-by: Ilan Tayari
Acked-by: Rami Rosen
Signed-off-by: Steffen Klassert

Ilan Tayari
2016-09-19 18:08:58 +0800

13 Sep, 2016

1 commit

b20b378d4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Conflicts:
drivers/net/ethernet/mediatek/mtk_eth_soc.c
drivers/net/ethernet/qlogic/qed/qed_dcbx.c
drivers/net/phy/Kconfig

All conflicts were cases of overlapping commits.

Signed-off-by: David S. Miller

David S. Miller
2016-09-13 06:52:44 +0800

11 Sep, 2016

1 commit

65b323e2f xfrm: use IS_ENABLED() instead of checking for built-in or module

The IS_ENABLED() macro checks if a Kconfig symbol has been enabled either
built-in or as a module. Use that macro instead of open coding the same.

Using the macro makes the code more readable by helping abstract away some
of the Kconfig built-in and module enable details.

Signed-off-by: Javier Martinez Canillas
Signed-off-by: David S. Miller

Javier Martinez Canillas
2016-09-11 12:19:11 +0800
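
Note on 65b323e2f: the sketch below imitates, in plain userspace C, how a macro can answer "is this symbol defined to 1, either directly or with a _MODULE suffix" without an open-coded #if defined(...) || defined(..._MODULE). It is modelled on the preprocessor trick used for this purpose, but every DEMO_*/demo_* name and the three CONFIG_DEMO_* symbols are hypothetical and exist only for this example.

```c
#include <assert.h>

/* Hypothetical configuration: one built-in option, one modular option. */
#define CONFIG_DEMO_BUILTIN 1
#define CONFIG_DEMO_MODULAR_MODULE 1
/* CONFIG_DEMO_ABSENT is deliberately left undefined. */

/* If the symbol expands to 1, the placeholder inserts an extra "0,"
 * argument, which shifts the literal 1 into the second position. */
#define DEMO_ARG_PLACEHOLDER_1 0,
#define demo_take_second_arg(ignored, val, ...) val
#define demo_is_defined(x) demo_is_defined_1(x)
#define demo_is_defined_1(val) demo_is_defined_2(DEMO_ARG_PLACEHOLDER_##val)
#define demo_is_defined_2(arg1_or_junk) \
    demo_take_second_arg(arg1_or_junk 1, 0, 0)

#define DEMO_IS_BUILTIN(opt) demo_is_defined(opt)
#define DEMO_IS_MODULE(opt)  demo_is_defined(opt##_MODULE)
#define DEMO_IS_ENABLED(opt) (DEMO_IS_BUILTIN(opt) || DEMO_IS_MODULE(opt))

/* Each helper evaluates to a compile-time constant 0 or 1. */
int demo_builtin_enabled(void) { return DEMO_IS_ENABLED(CONFIG_DEMO_BUILTIN); }
int demo_modular_enabled(void) { return DEMO_IS_ENABLED(CONFIG_DEMO_MODULAR); }
int demo_absent_enabled(void)  { return DEMO_IS_ENABLED(CONFIG_DEMO_ABSENT); }
```

Because the result is an ordinary constant expression, it can be used in both #if directives and regular C conditions, which is the readability gain the commit message refers to.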

09 Sep, 2016

2 commits

2f30ea509 xfrm_user: propagate sec ctx allocation errors

When we fail to attach the security context in xfrm_state_construct()
we'll return 0 as error value which, in turn, will wrongly claim success
to userland when, in fact, we won't be adding / updating the XFRM state.

This is a regression introduced by commit fd21150a0fe1 ("[XFRM] netlink:
Inline attach_encap_tmpl(), attach_sec_ctx(), and attach_one_addr()").

Fix it by propagating the error returned by security_xfrm_state_alloc()
in this case.

Fixes: fd21150a0fe1 ("[XFRM] netlink: Inline attach_encap_tmpl()...")
Signed-off-by: Mathias Krause
Cc: Thomas Graf
Signed-off-by: Steffen Klassert

Mathias Krause
2016-09-09 15:02:08 +0800
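
Note on 2f30ea509: the bug pattern described here, jumping to an error label while the error variable still holds 0, can be reproduced in a few lines. This is a minimal sketch and not the code from xfrm_state_construct(); demo_sec_alloc() is a hypothetical stand-in for security_xfrm_state_alloc(), and the choice of -ENOMEM is an assumption.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the security context allocation. */
static int demo_sec_alloc(int should_fail)
{
    return should_fail ? -ENOMEM : 0;
}

/* Buggy shape: the failure is detected, but its value is discarded. */
int demo_construct_buggy(int sec_fails)
{
    int err = 0;                 /* earlier steps succeeded */

    if (demo_sec_alloc(sec_fails))
        goto error;              /* err is still 0 here */
    return 0;
error:
    return err;                  /* wrongly reports success */
}

/* Fixed shape: store the callee's return value before branching. */
int demo_construct_fixed(int sec_fails)
{
    int err;

    err = demo_sec_alloc(sec_fails);
    if (err)
        goto error;
    return 0;
error:
    return err;                  /* propagates the real error */
}
```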

575f9c43e Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next

Steffen Klassert says:

====================
ipsec-next 2016-09-08

1) Constify the xfrm_replay structures. From Julia Lawall

2) Protect xfrm state hash tables with rcu, lookups
can be done now without acquiring xfrm_state_lock.
From Florian Westphal.

3) Protect xfrm policy hash tables with rcu, lookups
can be done now without acquiring xfrm_policy_lock.
From Florian Westphal.

4) We don't need to have a garbage collector list per
namespace anymore, so use a global one instead.
From Florian Westphal.
====================

Signed-off-by: David S. Miller

David S. Miller
2016-09-09 04:09:41 +0800

08 Sep, 2016

1 commit

0f76d2564 net: xfrm: Change u32 sysctl entries to use proc_douintvec

proc_dointvec limits the values to INT_MAX in u32 sysctl entries.
proc_douintvec allows writing values up to UINT_MAX.

Signed-off-by: Subash Abhinov Kasiviswanathan
Signed-off-by: David S. Miller

subashab@codeaurora.org
2016-09-08 14:17:53 +0800
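
Note on 0f76d2564: the difference in accepted range can be illustrated with a userspace parser. This sketch does not reproduce the parsing or the error codes of proc_dointvec or proc_douintvec; the demo_* functions are hypothetical, and a 32-bit int is assumed.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a decimal string and store it in *val if it does not exceed max. */
static int demo_parse(const char *s, unsigned long long max, unsigned int *val)
{
    char *end;
    unsigned long long v;

    if (*s == '-')
        return -EINVAL;          /* negative input is rejected outright */
    errno = 0;
    v = strtoull(s, &end, 10);
    if (errno || end == s || *end != '\0' || v > max)
        return -EINVAL;
    *val = (unsigned int)v;
    return 0;
}

/* Signed-style handler: the upper half of the u32 range is unreachable. */
int demo_write_intvec(const char *s, unsigned int *val)
{
    return demo_parse(s, INT_MAX, val);
}

/* Unsigned-style handler: the full u32 range is accepted. */
int demo_write_uintvec(const char *s, unsigned int *val)
{
    return demo_parse(s, UINT_MAX, val);
}

/* Self-check with a value between INT_MAX and UINT_MAX. */
int demo_large_value_roundtrip(void)
{
    unsigned int v = 0;

    return demo_write_intvec("3000000000", &v) == -EINVAL &&
           demo_write_uintvec("3000000000", &v) == 0 &&
           v == 3000000000u;
}
```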

24 Aug, 2016

2 commits

35db57bbc xfrm: state: remove per-netns gc task

After commit 5b8ef3415a21f173
("xfrm: Remove ancient sleeping when the SA is in acquire state")
gc does not need any per-netns data anymore.

As far as gc is concerned all state structs are the same, so we
can use a global work struct for it.

Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert

Florian Westphal
2016-08-24 19:16:06 +0800

4141b36ab xfrm: Fix xfrm_policy_lock imbalance

An earlier patch accidentally replaced a write_lock_bh
with a spin_unlock_bh. Fix this by using spin_lock_bh
instead.

Fixes: 9d0380df6217 ("xfrm: policy: convert policy_lock to spinlock")
Signed-off-by: Steffen Klassert

Steffen Klassert
2016-08-24 19:13:08 +0800

12 Aug, 2016

8 commits

9d0380df6 xfrm: policy: convert policy_lock to spinlock

After the conversions in earlier patches, all spots acquire the writer lock,
so we can now convert this to a normal spinlock.

Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert

Florian Westphal
2016-08-12 14:07:12 +0800

d5b8f86dc xfrm: policy: don't acquire policy lock in xfrm_spd_getinfo

It doesn't seem that important.

We now get an inconsistent view of the counters, but those are stale anyway
right after we drop the lock.

Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert

Florian Westphal
2016-08-12 14:07:12 +0800

ae33786f7 xfrm: policy: only use rcu in xfrm_sk_policy_lookup

Don't acquire the readlock anymore and rely on rcu alone.

In case a writer on another CPU changed the policy at the wrong moment (after
we obtained the sk policy pointer but before we could obtain the reference),
just repeat the lookup.

Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert

Florian Westphal
2016-08-12 14:07:12 +0800

a7c44247f xfrm: policy: make xfrm_policy_lookup_bytype lockless

side effect: no longer disables BH (should be fine).

Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert

Florian Westphal
2016-08-12 14:07:12 +0800

e37cc8ade xfrm: policy: use atomic_inc_not_zero in rcu section

If we don't hold the policy lock anymore the refcnt might
already be 0, i.e. policy struct is about to be free'd.

Switch to atomic_inc_not_zero to avoid this.

On removal, policies are already unlinked from the tables (lists)
before the last _put occurs, so we are not supposed to find the same
'dead' entry on the next loop, so it's safe to just repeat the lookup.

Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert

Florian Westphal
2016-08-12 14:07:11 +0800
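
Note on e37cc8ade: the "increment only if not already zero" operation can be sketched with C11 atomics. This is a userspace approximation and not the kernel's atomic_inc_not_zero(); the struct and the demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct demo_policy {
    atomic_int refcnt;           /* 0 means the object is being freed */
};

/* Take a reference only if the count has not already dropped to zero. */
bool demo_inc_not_zero(atomic_int *v)
{
    int old = atomic_load(v);

    while (old != 0) {
        /* On failure the CAS reloads 'old', so the loop re-checks it. */
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return true;
    }
    return false;                /* caller must repeat the lookup */
}

/* Returns the new count on success, or -1 if the object was dead. */
int demo_try_get(int initial)
{
    struct demo_policy p;

    atomic_init(&p.refcnt, initial);
    return demo_inc_not_zero(&p.refcnt) ? atomic_load(&p.refcnt) : -1;
}
```

A plain increment would revive an object whose count had reached zero; the compare-and-swap loop refuses to do so, which is why the lockless reader can safely retry instead.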

30846090a xfrm: policy: add sequence count to sync with hash resize

Once xfrm_policy_lookup_bytype doesn't grab xfrm_policy_lock anymore, it's
possible for a hash resize to occur in parallel.

Use a sequence counter to block lookups in case a resize is in
progress and to also re-lookup in case the hash table was altered
in the meantime (which might cause us to not find the best match).

Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert

Florian Westphal
2016-08-12 14:07:11 +0800
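
Note on 30846090a: the reader/writer protocol of a sequence counter can be sketched as follows. This is a simplified, single-threaded userspace illustration with C11 atomics; it does not use the kernel's seqcount helpers, the memory ordering is deliberately kept simple, and all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct demo_seqcount {
    atomic_uint seq;             /* odd while a resize is in progress */
};

/* Reader entry: wait until no writer is active, then remember the count. */
unsigned int demo_read_begin(struct demo_seqcount *s)
{
    unsigned int v;

    while ((v = atomic_load_explicit(&s->seq, memory_order_acquire)) & 1)
        ;                        /* resize in progress: block the lookup */
    return v;
}

/* Reader exit: true means the table changed and the lookup must be redone. */
bool demo_read_retry(struct demo_seqcount *s, unsigned int start)
{
    atomic_thread_fence(memory_order_acquire);
    return atomic_load_explicit(&s->seq, memory_order_relaxed) != start;
}

/* Writer side: bump the count before and after modifying the table. */
void demo_write_begin(struct demo_seqcount *s) { atomic_fetch_add(&s->seq, 1); }
void demo_write_end(struct demo_seqcount *s)   { atomic_fetch_add(&s->seq, 1); }

/* A lookup that overlaps a resize is told to retry. */
int demo_overlapping_reader_retries(void)
{
    struct demo_seqcount s;
    unsigned int start;

    atomic_init(&s.seq, 0);
    start = demo_read_begin(&s);
    demo_write_begin(&s);        /* resize happens during the lookup */
    demo_write_end(&s);
    return demo_read_retry(&s, start);
}

/* A lookup with no concurrent resize keeps its result. */
int demo_quiet_reader_retries(void)
{
    struct demo_seqcount s;

    atomic_init(&s.seq, 0);
    return demo_read_retry(&s, demo_read_begin(&s));
}
```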

e1e551bc5 xfrm: policy: prepare policy_bydst hash for rcu lookups

Since commit 56f047305dd4b6b617
("xfrm: add rcu grace period in xfrm_policy_destroy()") xfrm policy
objects are already free'd via rcu.

In order to make more places lockless (i.e. use rcu_read_lock instead of
grabbing read-side of policy rwlock) we only need to:

- use rcu_assign_pointer to store address of new hash table backend memory
- add rcu barrier so that freeing of old memory is delayed (expansion
and free happens from system workqueue, so synchronize_rcu is fine)
- use rcu_dereference to fetch current address of the hash table.

Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert

Florian Westphal
2016-08-12 14:07:11 +0800
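
Note on e1e551bc5: the three steps listed in this commit (publish the new table, delay freeing the old one, fetch the current table) can be approximated in userspace with release/acquire atomics. This sketch has no grace-period mechanism at all; the place where the kernel would wait is only marked with a comment. The acquire load is a stronger substitute for the dependency ordering of rcu_dereference, and all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

struct demo_table {
    unsigned int size;           /* number of hash buckets */
};

static _Atomic(struct demo_table *) demo_current;

/* Publish: the table contents must be visible before the pointer is. */
void demo_publish(struct demo_table *t)
{
    atomic_store_explicit(&demo_current, t, memory_order_release);
}

/* Fetch: readers obtain the currently published table. */
struct demo_table *demo_fetch(void)
{
    return atomic_load_explicit(&demo_current, memory_order_acquire);
}

/* Resize: swap in the new table and hand back the old one. */
struct demo_table *demo_resize(struct demo_table *new_table)
{
    struct demo_table *old = demo_fetch();

    demo_publish(new_table);
    /* In the kernel, a wait for pre-existing readers would go here,
     * before the old table's memory may be freed. */
    return old;
}

/* Self-check: after a resize, readers see the new table. */
int demo_resize_visible(void)
{
    static struct demo_table small = { 8 }, large = { 16 };
    struct demo_table *old;

    demo_publish(&small);
    old = demo_resize(&large);
    return old == &small && demo_fetch()->size == 16;
}
```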

a5eefc1df xfrm: policy: use rcu versions for iteration and list add/del

This is required once we allow lockless readers.

Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert

Florian Westphal
2016-08-12 14:07:11 +0800

11 Aug, 2016

1 commit

1625f4529 net/xfrm_input: fix possible NULL deref of tunnel.ip6->parms.i_key

Running LTP 'icmp-uni-basic.sh -6 -p ipcomp -m tunnel' test over
openvswitch + veth can trigger kernel panic:

BUG: unable to handle kernel NULL pointer dereference
at 00000000000000e0 IP: [] xfrm_input+0x82/0x750
...
[] xfrm6_rcv_spi+0x1e/0x20
[] xfrm6_tunnel_rcv+0x42/0x50 [xfrm6_tunnel]
[] tunnel6_rcv+0x3e/0x8c [tunnel6]
[] ip6_input_finish+0xd5/0x430
[] ip6_input+0x33/0x90
[] ip6_rcv_finish+0xa5/0xb0
...

It seems that tunnel.ip6 can have garbage values and is also dereferenced
without a proper check; only tunnel.ip4 is being verified. Fix it by
adding one more if block for AF_INET6 and initializing tunnel.ip6 with NULL
inside xfrm6_rcv_spi() (which is similar to xfrm4_rcv_spi()).

Fixes: 049f8e2 ("xfrm: Override skb->mark with tunnel->parm.i_key in xfrm_input")

Signed-off-by: Alexey Kodanev
Signed-off-by: Steffen Klassert

Alexey Kodanev
2016-08-11 19:15:57 +0800
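
Note on 1625f4529: the shape of the fix, one NULL check per address family before the tunnel pointer is dereferenced, can be sketched as below. The struct layout, the enum and the demo_* names are assumptions for illustration and do not match the kernel's skb control block.

```c
#include <assert.h>
#include <stddef.h>

struct demo_tunnel {
    unsigned int i_key;
};

/* Hypothetical per-packet control block with one pointer per family. */
struct demo_tunnel_cb {
    struct demo_tunnel *ip4;
    struct demo_tunnel *ip6;
};

enum demo_family { DEMO_AF_INET, DEMO_AF_INET6 };

/* Override the mark with the tunnel key only if a tunnel is attached. */
unsigned int demo_mark(const struct demo_tunnel_cb *cb,
                       enum demo_family family, unsigned int mark)
{
    switch (family) {
    case DEMO_AF_INET:
        if (cb->ip4)
            mark = cb->ip4->i_key;
        break;
    case DEMO_AF_INET6:
        if (cb->ip6)             /* the check that was missing for IPv6 */
            mark = cb->ip6->i_key;
        break;
    }
    return mark;
}

/* No tunnel: the pointer is explicitly NULL, so the mark is untouched. */
unsigned int demo_mark_without_tunnel(void)
{
    struct demo_tunnel_cb cb = { NULL, NULL };

    return demo_mark(&cb, DEMO_AF_INET6, 7);
}

/* Tunnel present: the mark is replaced by the tunnel key. */
unsigned int demo_mark_with_tunnel(void)
{
    struct demo_tunnel t = { 42 };
    struct demo_tunnel_cb cb = { NULL, &t };

    return demo_mark(&cb, DEMO_AF_INET6, 7);
}
```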

10 Aug, 2016

7 commits

d737a5805 xfrm: state: don't use lock anymore unless acquire operation is needed

push the lock down, after earlier patches we can rely on rcu to
make sure state struct won't go away.

Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert

Florian Westphal
2016-08-10 17:23:24 +0800

c8406998b xfrm: state: use rcu_deref and assign_pointer helpers

Before xfrm_state_find() can use rcu_read_lock instead of xfrm_state_lock
we need to switch users of the hash table to assign/obtain the pointers
with the appropriate rcu helpers.

Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert

Florian Westphal
2016-08-10 17:23:24 +0800

b65e3d7be xfrm: state: add sequence count to detect hash resizes

Once xfrm_state_find is lockless we have to cope with a concurrent
resize operation.

We use a sequence counter to block in case a resize is in progress
and to detect if we might have missed a state that got moved to
a new hash table.

Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert

Florian Westphal
2016-08-10 17:23:24 +0800

df7274eb7 xfrm: state: delay freeing until rcu grace period has elapsed

The hash table backend memory and the state structs are free'd via
kfree/vfree.

Once we only rely on rcu during lookups we have to make sure no other cpu
is currently accessing this before doing the free.

Free operations already happen from worker so we can use synchronize_rcu
to wait until concurrent readers are done.

Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert

Florian Westphal
2016-08-10 17:23:23 +0800

02efdff7e xfrm: state: use atomic_inc_not_zero to increment refcount

Once xfrm_state_lookup_byaddr no longer acquires the state lock another
cpu might be freeing the state entry at the same time.

To detect this we use atomic_inc_not_zero and then signal -EAGAIN to the
caller in case our result was stale.

Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert

Florian Westphal
2016-08-10 17:23:23 +0800

ae3fb6d32 xfrm: state: use hlist_for_each_entry_rcu helper

This is required once we allow lockless access of bydst/bysrc hash tables.

Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert

Florian Westphal
2016-08-10 17:23:23 +0800

e45a8a9e6 xfrm: constify xfrm_replay structures

The xfrm_replay structures are never modified, so declare them as const.

Done with the help of Coccinelle.

Signed-off-by: Julia Lawall
Signed-off-by: Steffen Klassert

Julia Lawall
2016-08-10 17:18:49 +0800

29 Jul, 2016

1 commit

6916fb3b1 xfrm: Ignore socket policies when rebuilding hash tables

Whenever thresholds are changed the hash tables are rebuilt. This is
done by enumerating all policies and hashing and inserting them into
the right table according to the thresholds and direction.

Because socket policies are also contained in net->xfrm.policy_all but
no hash tables are defined for their direction (dir + XFRM_POLICY_MAX)
this causes a NULL or invalid pointer dereference after returning from
policy_hash_bysel() if the rebuild is done while any socket policies
are installed.

Since the rebuild after changing thresholds is scheduled this crash
could even occur if the userland sets thresholds seemingly before
installing any socket policies.

Fixes: 53c2e285f970 ("xfrm: Do not hash socket policies")
Signed-off-by: Tobias Brunner
Acked-by: Herbert Xu
Signed-off-by: Steffen Klassert

Tobias Brunner
2016-07-29 16:21:54 +0800
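
Note on 6916fb3b1: the rebuild loop described here can be sketched as a walk over all policies that skips any entry whose direction has no hash table. This is not the kernel code; DEMO_POLICY_MAX, the struct and the demo_* names are hypothetical, and the per-direction counters merely stand in for real hash insertion.

```c
#include <assert.h>

#define DEMO_POLICY_MAX 3        /* hypothetical number of hashed directions */

struct demo_policy {
    int dir;                     /* socket policies use dir + DEMO_POLICY_MAX */
};

/* Re-insert every policy into its per-direction table; return the count. */
int demo_rebuild(const struct demo_policy *pols, int n,
                 int per_dir[DEMO_POLICY_MAX])
{
    int hashed = 0;

    for (int i = 0; i < n; i++) {
        if (pols[i].dir >= DEMO_POLICY_MAX)
            continue;            /* socket policy: no table exists for it */
        per_dir[pols[i].dir]++;  /* stands in for hashing and inserting */
        hashed++;
    }
    return hashed;
}

/* Self-check: one socket policy among three regular ones is ignored. */
int demo_rebuild_skips_socket(void)
{
    struct demo_policy pols[] = {
        { 0 }, { 1 }, { DEMO_POLICY_MAX + 1 }, { 2 }
    };
    int per_dir[DEMO_POLICY_MAX] = { 0 };

    return demo_rebuild(pols, 4, per_dir) == 3 && per_dir[1] == 1;
}
```

Without the bounds check, the socket policy's direction would index past the end of the per-direction array, which mirrors the invalid dereference described in the commit.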

27 Jul, 2016

2 commits

7677c7560 xfrm: get rid of another incorrect WARN

During fuzzing I regularly run into this WARN(). According to Herbert Xu,
this "certainly shouldn't be a WARN, it probably shouldn't print anything
either".

Cc: Stephen Hemminger
Cc: Steffen Klassert
Cc: Herbert Xu
Signed-off-by: Vegard Nossum
Signed-off-by: Steffen Klassert

Vegard Nossum
2016-07-27 19:09:00 +0800

73efc3245 xfrm: get rid of incorrect WARN

AFAICT this message is just printed whenever input validation fails.
This is a normal failure and we shouldn't be dumping the stack over it.

Looks like it was originally a printk that was maybe incorrectly
upgraded to a WARN:

commit 62db5cfd70b1ef53aa21f144a806fe3b78c84fab
Author: stephen hemminger
Date: Wed May 12 06:37:06 2010 +0000

xfrm: add severity to printk

Cc: Stephen Hemminger
Cc: Steffen Klassert
Signed-off-by: Vegard Nossum
Signed-off-by: Steffen Klassert

Vegard Nossum
2016-07-27 19:07:46 +0800

18 Jul, 2016

1 commit

1ba5bf993 xfrm: fix crash in XFRM_MSG_GETSA netlink handler

If we hit any of the error conditions inside xfrm_dump_sa(), then
xfrm_state_walk_init() never gets called. However, we still call
xfrm_state_walk_done() from xfrm_dump_sa_done(), which will crash
because the state walk was never initialized properly.

We can fix this by setting cb->args[0] only after we've processed the
first element and checking this before calling xfrm_state_walk_done().

Fixes: d3623099d3 ("ipsec: add support of limited SA dump")
Cc: Nicolas Dichtel
Cc: Steffen Klassert
Signed-off-by: Vegard Nossum
Acked-by: Nicolas Dichtel
Signed-off-by: Steffen Klassert

Vegard Nossum
2016-07-18 15:37:02 +0800
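
Note on 1ba5bf993: the guard described in this commit, marking the dump as started only once the walk exists and checking that mark in the completion handler, can be sketched as follows. The structs and demo_* functions are hypothetical stand-ins, and the choice of -EINVAL for the early failure is an assumption.

```c
#include <assert.h>
#include <errno.h>

struct demo_walk {
    int initialized;
};

/* Hypothetical dump callback state, loosely modelled on cb->args[]. */
struct demo_cb {
    long args[1];
    struct demo_walk walk;
};

/* Dump step: args[0] is set only after the first element was processed. */
int demo_dump(struct demo_cb *cb, int fail_validation)
{
    if (!cb->args[0]) {
        if (fail_validation)
            return -EINVAL;      /* bail out before the walk exists */
        cb->walk.initialized = 1;    /* stands in for walk init */
        /* ... first element processed here ... */
        cb->args[0] = 1;
    }
    return 0;
}

/* Completion step: returns 1 if the walk was torn down, 0 if skipped. */
int demo_done(struct demo_cb *cb)
{
    if (!cb->args[0])
        return 0;                /* never initialized: nothing to undo */
    cb->walk.initialized = 0;    /* stands in for walk teardown */
    return 1;
}

int demo_failed_dump_skips_done(void)
{
    struct demo_cb cb = { { 0 }, { 0 } };

    return demo_dump(&cb, 1) == -EINVAL && demo_done(&cb) == 0;
}

int demo_good_dump_runs_done(void)
{
    struct demo_cb cb = { { 0 }, { 0 } };

    return demo_dump(&cb, 0) == 0 && demo_done(&cb) == 1;
}
```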

10 May, 2016

1 commit

e800072c1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

In netdevice.h we removed the structure in net-next that is being
changed in 'net'. In macsec.c and rtnetlink.c we have overlaps
between fixes in 'net' and the u64 attribute changes in 'net-next'.

The mlx5 conflicts have to do with vxlan support dependencies.

Signed-off-by: David S. Miller

David S. Miller
2016-05-10 03:59:24 +0800

05 May, 2016

1 commit

32b583a0c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec

Steffen Klassert says:

====================
pull request (net): ipsec 2016-05-04

1) The flowcache can hit an OOM condition if too
many entries are in the gc_list. Fix this by
counting the entries in the gc_list and refuse
new allocations if the value is too high.

2) The inner headers are invalid after a xfrm transformation,
so reset the skb encapsulation field to ensure nobody tries
to access the inner headers. Otherwise tunnel devices stacked
on top of xfrm may build the outer headers based on wrong
information.

3) Add pmtu handling to vti, we need it to report
pmtu information for locally generated packets.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller

David S. Miller
2016-05-05 04:35:31 +0800

24 Apr, 2016

1 commit

de95c4a46 xfrm: align nlattr properly when needed

Signed-off-by: Nicolas Dichtel
Signed-off-by: David S. Miller

Nicolas Dichtel
2016-04-24 08:13:25 +0800

25 Mar, 2016

1 commit

071d36bf2 xfrm: Fix crash observed during device unregistration and decryption

A crash is observed when a decrypted packet is processed in the receive
path. get_rps_cpu() tries to dereference the skb->dev fields, but judging
from the poison pattern the device appears to have been freed.

[] get_rps_cpu+0x94/0x2f0
[] netif_rx_internal+0x140/0x1cc
[] netif_rx+0x74/0x94
[] xfrm_input+0x754/0x7d0
[] xfrm_input_resume+0x10/0x1c
[] esp_input_done+0x20/0x30
[] process_one_work+0x244/0x3fc
[] worker_thread+0x2f8/0x418
[] kthread+0xe0/0xec

-013|get_rps_cpu(
| dev = 0xFFFFFFC08B688000,
| skb = 0xFFFFFFC0C76AAC00 -> (
| dev = 0xFFFFFFC08B688000 -> (
| name =
"......................................................
| name_hlist = (next = 0xAAAAAAAAAAAAAAAA, pprev =
0xAAAAAAAAAAA

The following sequence of events was observed:

- Encrypted packet in receive path from netdevice is queued
- Encrypted packet queued for decryption (asynchronous)
- Netdevice brought down and freed
- Packet is decrypted and returned through callback in esp_input_done
- Packet is queued again for processing in the network stack using netif_rx

Since the device appears to have been freed, the dereference of
skb->dev in get_rps_cpu() leads to an unhandled page fault
exception.

Fix this by holding on to device reference when queueing packets
asynchronously and releasing the reference on call back return.

v2: Make the change generic to xfrm as mentioned by Steffen and
update the title to xfrm

Suggested-by: Herbert Xu
Signed-off-by: Jerome Stanislaus
Signed-off-by: Subash Abhinov Kasiviswanathan
Signed-off-by: David S. Miller

subashab@codeaurora.org
2016-03-25 02:29:36 +0800
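
Note on 071d36bf2: the lifetime problem in the event sequence above, and the effect of holding a device reference across the asynchronous step, can be modelled with a plain counter. This is a single-threaded sketch with a non-atomic refcount; the struct and the demo_* names are hypothetical, and it does not show which kernel helpers the patch actually calls.

```c
#include <assert.h>

struct demo_dev {
    int refcnt;
    int freed;
};

static void demo_dev_hold(struct demo_dev *d)
{
    d->refcnt++;
}

static void demo_dev_put(struct demo_dev *d)
{
    if (--d->refcnt == 0)
        d->freed = 1;            /* last reference gone: device is freed */
}

/* Returns 1 if the device was still alive when the async callback ran. */
int demo_async_input(int hold_across_async)
{
    struct demo_dev dev = { 1, 0 };  /* one reference owned by the stack */
    int alive;

    if (hold_across_async)
        demo_dev_hold(&dev);     /* taken when queueing for decryption */

    demo_dev_put(&dev);          /* device brought down while pending */

    alive = !dev.freed;          /* callback resumes and touches the dev */

    if (hold_across_async)
        demo_dev_put(&dev);      /* released when the callback returns */
    return alive;
}
```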

23 Mar, 2016

1 commit

2bf8c4762 net/xfrm_user: use in_compat_syscall to deny compat syscalls

The code wants to prevent compat code from receiving messages. Use
in_compat_syscall for this.

Signed-off-by: Andy Lutomirski
Cc: Steffen Klassert
Cc: Herbert Xu
Cc: "David S. Miller"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andy Lutomirski
2016-03-23 06:36:02 +0800

17 Mar, 2016

1 commit

215276c01 xfrm: Reset encapsulation field of the skb before transformation

The inner headers are invalid after a xfrm transformation.
So reset the skb encapsulation field to ensure nobody tries
to access the inner headers.

Signed-off-by: Steffen Klassert

Steffen Klassert
2016-03-17 17:28:44 +0800

27 Jan, 2016

1 commit

17bc19702 ipsec: Use skcipher and ahash when probing algorithms

This patch removes the last reference to hash and ablkcipher from
IPsec and replaces them with ahash and skcipher respectively. For
skcipher there is currently no difference at all, while for ahash
the current code is actually buggy and would prevent asynchronous
algorithms from being discovered.

Signed-off-by: Herbert Xu
Acked-by: David S. Miller

Herbert Xu
2016-01-27 20:36:07 +0800

16 Jan, 2016

1 commit

9207f9d45 net: preserve IP control block during GSO segmentation

skb_gso_segment() uses the skb control block during segmentation.
This patch adds 32 bytes of room for the previous control block, which
will be copied into all resulting segments.

This patch fixes a kernel crash when fragmenting forwarded packets.
Fragmentation requires a valid IP CB in the skb for clearing IP options.
The patch also removes the custom save/restore in the ovs code, which is now redundant.

Signed-off-by: Konstantin Khlebnikov
Link: http://lkml.kernel.org/r/CALYGNiP-0MZ-FExV2HutTvE9U-QQtkKSoE--KN=JQE5STYsjAA@mail.gmail.com
Signed-off-by: David S. Miller

Konstantin Khlebnikov
2016-01-16 03:35:24 +0800