30 Sep, 2016
1 commit
-
This is to use the generic interfaces snmp_get_cpu_field{,64}_batch to
aggregate the data by going through all the items of each cpu sequentially.
Signed-off-by: Jia He
Signed-off-by: David S. Miller
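The loop order described in this entry (visit each cpu once and add all of its items before moving to the next cpu) can be sketched in plain userspace C. This is not the kernel's snmp_get_cpu_field_batch code; the array sizes and the names sum_one_cpu and aggregate_batch are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS_SKETCH  4   /* hypothetical number of cpus */
#define NR_ITEMS_SKETCH 3   /* hypothetical number of counters per cpu */

/* Add every counter of one cpu into the running totals in a single pass. */
static void sum_one_cpu(uint64_t *totals, const uint64_t *cpu_items, size_t n)
{
    for (size_t i = 0; i < n; i++)
        totals[i] += cpu_items[i];
}

/*
 * Outer loop over cpus, inner loop over items: each cpu's block of
 * counters is read sequentially instead of hopping between cpus for
 * every single item.
 */
static void aggregate_batch(uint64_t totals[NR_ITEMS_SKETCH],
                            uint64_t percpu[NR_CPUS_SKETCH][NR_ITEMS_SKETCH])
{
    for (size_t i = 0; i < NR_ITEMS_SKETCH; i++)
        totals[i] = 0;
    for (size_t cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
        sum_one_cpu(totals, percpu[cpu], NR_ITEMS_SKETCH);
}
```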
24 Sep, 2016
1 commit
-
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2016-09-23
Only two patches this time:
1) Fix a comment reference to struct xfrm_replay_state_esn.
From Richard Guy Briggs.
2) Convert xfrm_state_lookup to rcu, we don't need the
xfrm_state_lock anymore in the input path.
From Florian Westphal.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller
23 Sep, 2016
1 commit
21 Sep, 2016
1 commit
-
This is called from the packet input path, we get lock contention
if many cpus handle ipsec in parallel.
After recent rcu conversion it is safe to call __xfrm_state_lookup
without the spinlock.
Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert
19 Sep, 2016
1 commit
-
commit 1a6509d99122 ("[IPSEC]: Add support for combined mode algorithms")
introduced aead. The function attach_aead kmemdup()s the algorithm
name during xfrm_state_construct().
However this memory is never freed.
Implementation has since been slightly modified in
commit ee5c23176fcc ("xfrm: Clone states properly on migration")
without resolving this leak.
This patch adds a kfree() call for the aead algorithm name.
Fixes: 1a6509d99122 ("[IPSEC]: Add support for combined mode algorithms")
Signed-off-by: Ilan Tayari
Acked-by: Rami Rosen
Signed-off-by: Steffen Klassert
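The leak pattern in this entry (a duplicated buffer that the destroy path forgets to release) can be shown with a small userspace analogue. This is only a sketch: struct state_sketch, memdup_sketch, attach_aead_sketch and state_destroy_sketch are hypothetical names, and malloc/free stand in for kmemdup()/kfree().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the state object; only the relevant field. */
struct state_sketch {
    char *aead;   /* heap copy made at construction time */
};

/* Userspace analogue of kmemdup(): allocate and copy len bytes. */
static void *memdup_sketch(const void *src, size_t len)
{
    void *p = malloc(len);

    if (p)
        memcpy(p, src, len);
    return p;
}

/* Construction duplicates the caller's buffer into the state. */
static int attach_aead_sketch(struct state_sketch *x, const char *name)
{
    x->aead = memdup_sketch(name, strlen(name) + 1);
    return x->aead ? 0 : -1;
}

/* Destruction must release what construction duplicated. */
static void state_destroy_sketch(struct state_sketch *x)
{
    free(x->aead);   /* the kind of call that was missing */
    x->aead = NULL;
}
```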
13 Sep, 2016
1 commit
-
Conflicts:
drivers/net/ethernet/mediatek/mtk_eth_soc.c
drivers/net/ethernet/qlogic/qed/qed_dcbx.c
drivers/net/phy/Kconfig
All conflicts were cases of overlapping commits.
Signed-off-by: David S. Miller
11 Sep, 2016
1 commit
-
The IS_ENABLED() macro checks if a Kconfig symbol has been enabled either
built-in or as a module; use that macro instead of open coding the same.
Using the macro makes the code more readable by helping abstract away some
of the Kconfig built-in and module enable details.
Signed-off-by: Javier Martinez Canillas
Signed-off-by: David S. Miller
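To make the difference between the open-coded check and the macro concrete, here is a userspace sketch. It rebuilds a similar macro with the placeholder-token preprocessor trick rather than including the kernel header, so every SK_* name and every CONFIG_SKETCH_* symbol below is hypothetical and the real definition may differ in detail.

```c
/* Hypothetical config symbols: one built-in, one modular, one left unset. */
#define CONFIG_SKETCH_BUILTIN 1
#define CONFIG_SKETCH_MODULAR_MODULE 1

/* A symbol defined to 1 turns into "0," and shifts the argument list. */
#define SK_PLACEHOLDER_1 0,
#define SK_SECOND(ignored, val, ...) val
#define SK_IS_DEFINED(x) SK_IS_DEFINED_(x)
#define SK_IS_DEFINED_(val) SK_IS_DEFINED__(SK_PLACEHOLDER_##val)
#define SK_IS_DEFINED__(arg1_or_junk) SK_SECOND(arg1_or_junk 1, 0, 0)

#define SK_IS_BUILTIN(opt) SK_IS_DEFINED(opt)
#define SK_IS_MODULE(opt)  SK_IS_DEFINED(opt##_MODULE)
#define SK_IS_ENABLED(opt) (SK_IS_BUILTIN(opt) || SK_IS_MODULE(opt))

/* Open-coded form: both spellings of the symbol are tested by hand. */
static int open_coded_modular(void)
{
#if defined(CONFIG_SKETCH_MODULAR) || defined(CONFIG_SKETCH_MODULAR_MODULE)
    return 1;
#else
    return 0;
#endif
}

/* Macro form: one expression covers the built-in and the module case. */
static int builtin_enabled(void) { return SK_IS_ENABLED(CONFIG_SKETCH_BUILTIN); }
static int modular_enabled(void) { return SK_IS_ENABLED(CONFIG_SKETCH_MODULAR); }
static int unset_enabled(void)   { return SK_IS_ENABLED(CONFIG_SKETCH_UNSET); }
```

Because the macro form expands to an ordinary constant expression, it can be used inside a normal if () as well as in #if, which is part of what makes it easier to read than the open-coded test.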
09 Sep, 2016
2 commits
-
When we fail to attach the security context in xfrm_state_construct()
we'll return 0 as error value which, in turn, will wrongly claim success
to userland when, in fact, we won't be adding / updating the XFRM state.
This is a regression introduced by commit fd21150a0fe1 ("[XFRM] netlink:
Inline attach_encap_tmpl(), attach_sec_ctx(), and attach_one_addr()").
Fix it by propagating the error returned by security_xfrm_state_alloc()
in this case.
Fixes: fd21150a0fe1 ("[XFRM] netlink: Inline attach_encap_tmpl()...")
Signed-off-by: Mathias Krause
Cc: Thomas Graf
Signed-off-by: Steffen Klassert
-
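The error-propagation bug fixed in the xfrm_state_construct() entry above follows a common shape: a helper's return value is tested but never stored, so the error path reports whatever the error variable held before (here 0). A minimal userspace sketch, with alloc_sec_ctx_sketch standing in for security_xfrm_state_alloc() and all other names hypothetical:

```c
#include <errno.h>
#include <stddef.h>

struct state_sketch { int dummy; };
static struct state_sketch the_state;

/* Hypothetical stand-in for the security hook: fails on demand. */
static int alloc_sec_ctx_sketch(int should_fail)
{
    return should_fail ? -ENOMEM : 0;
}

/* Buggy shape: the result is tested but not saved into err. */
static struct state_sketch *construct_buggy(int should_fail, int *errp)
{
    int err = 0;   /* earlier steps succeeded and left err at 0 */

    if (alloc_sec_ctx_sketch(should_fail))
        goto error;   /* err is still 0 here */
    return &the_state;
error:
    *errp = err;      /* caller sees "success" despite the failure */
    return NULL;
}

/* Fixed shape: store the helper's error and propagate it. */
static struct state_sketch *construct_fixed(int should_fail, int *errp)
{
    int err;

    err = alloc_sec_ctx_sketch(should_fail);
    if (err)
        goto error;
    return &the_state;
error:
    *errp = err;
    return NULL;
}
```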
Steffen Klassert says:
====================
ipsec-next 2016-09-08
1) Constify the xfrm_replay structures. From Julia Lawall
2) Protect xfrm state hash tables with rcu, lookups
can be done now without acquiring xfrm_state_lock.
From Florian Westphal.
3) Protect xfrm policy hash tables with rcu, lookups
can be done now without acquiring xfrm_policy_lock.
From Florian Westphal.
4) We don't need to have a garbage collector list per
namespace anymore, so use a global one instead.
From Florian Westphal.
====================
Signed-off-by: David S. Miller
08 Sep, 2016
1 commit
-
proc_dointvec limits the values to INT_MAX in u32 sysctl entries.
proc_douintvec allows writing values up to UINT_MAX.
Signed-off-by: Subash Abhinov Kasiviswanathan
Signed-off-by: David S. Miller
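The range difference between the two handlers can be illustrated with a userspace analogy. The functions below are not the kernel's proc handlers; store_as_int and store_as_uint are hypothetical helpers that only mimic the limits named in this entry (INT_MAX versus UINT_MAX) when a decimal string is written into a 32-bit unsigned value.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a decimal string and reject anything above the given limit. */
static int store_limited(const char *buf, uint32_t *val, unsigned long long limit)
{
    char *end;
    unsigned long long v;

    errno = 0;
    v = strtoull(buf, &end, 10);
    if (errno || end == buf || *end != '\0' || v > limit)
        return -EINVAL;
    *val = (uint32_t)v;
    return 0;
}

/* Signed-style handling: the upper half of the u32 range is unreachable. */
static int store_as_int(const char *buf, uint32_t *val)
{
    return store_limited(buf, val, INT_MAX);
}

/* Unsigned-style handling: the full u32 range up to UINT_MAX is accepted. */
static int store_as_uint(const char *buf, uint32_t *val)
{
    return store_limited(buf, val, UINT_MAX);
}
```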
24 Aug, 2016
2 commits
-
After commit 5b8ef3415a21f173
("xfrm: Remove ancient sleeping when the SA is in acquire state")
gc does not need any per-netns data anymore.
As far as gc is concerned all state structs are the same, so we
can use a global work struct for it.
Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert
-
An earlier patch accidentally replaced a write_lock_bh
with a spin_unlock_bh. Fix this by using spin_lock_bh
instead.
Fixes: 9d0380df6217 ("xfrm: policy: convert policy_lock to spinlock")
Signed-off-by: Steffen Klassert
12 Aug, 2016
8 commits
-
After the conversions in earlier patches, all spots acquire the writer lock and
we can now convert this to a normal spinlock.
Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert
-
It doesn't seem that important.
We now get an inconsistent view of the counters, but those are stale anyway
right after we drop the lock.
Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert
-
Don't acquire the readlock anymore and rely on rcu alone.
In case a writer on another CPU changed the policy at the wrong moment (after we
obtained the sk policy pointer but before we could obtain the reference)
just repeat the lookup.
Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert
-
side effect: no longer disables BH (should be fine).
Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert
-
If we don't hold the policy lock anymore the refcnt might
already be 0, i.e. policy struct is about to be free'd.
Switch to atomic_inc_not_zero to avoid this.
On removal policies are already unlinked from the tables (lists)
before the last _put occurs so we are not supposed to find the same
'dead' entry on the next loop, so it's safe to just repeat the lookup.
Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert
-
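The atomic_inc_not_zero idea from the refcount entry above (take a reference only while the count is still non-zero, otherwise treat the object as dying and repeat the lookup) can be sketched with C11 atomics in userspace. This is an analogue, not the kernel primitive; struct policy_sketch and get_unless_zero are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted policy object. */
struct policy_sketch {
    atomic_int refcnt;
};

/*
 * Take a reference only if the count is still non-zero.
 * Returns false when the object is already on its way to being freed,
 * in which case the caller should drop it and repeat the lookup.
 */
static bool get_unless_zero(struct policy_sketch *p)
{
    int old = atomic_load(&p->refcnt);

    while (old != 0) {
        /* On failure, old is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&p->refcnt, &old, old + 1))
            return true;
    }
    return false;
}
```

A plain increment would turn a count of 0 back into 1 and hand out a pointer to memory that is about to be released; the compare-and-swap loop refuses that transition.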
Once xfrm_policy_lookup_bytype doesn't grab xfrm_policy_lock anymore it's
possible for a hash resize to occur in parallel.
Use sequence counter to block lookup in case a resize is in
progress and to also re-lookup in case hash table was altered
in the meantime (might cause us to not find the best-match).
Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert
-
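The sequence-counter scheme from the hash-resize entry above has two parts: a reader waits while a resize is running, and it repeats the lookup if the counter moved while it was reading. A minimal userspace sketch with C11 atomics follows; it is not the kernel's seqcount API, and resize_seq, table_generation and the function names are hypothetical.

```c
#include <stdatomic.h>

static atomic_uint resize_seq;       /* odd while a resize is in progress */
static atomic_int table_generation;  /* stand-in for the hash table contents */

/* Writer side: bump the counter before and after touching the table. */
static void resize_begin(void) { atomic_fetch_add(&resize_seq, 1); }
static void resize_end(void)   { atomic_fetch_add(&resize_seq, 1); }

/* Reader side: block while a resize runs, then remember the even value. */
static unsigned read_begin(void)
{
    unsigned seq;

    while ((seq = atomic_load(&resize_seq)) & 1)
        ;   /* resize active: spin until it finishes */
    return seq;
}

/* Non-zero if a resize started or finished since read_begin(). */
static int read_retry(unsigned seq)
{
    return atomic_load(&resize_seq) != seq;
}

/* Lookup loop: redo the read whenever the table changed underneath us. */
static int lookup_sketch(void)
{
    unsigned seq;
    int result;

    do {
        seq = read_begin();
        result = atomic_load(&table_generation);   /* the "lookup" */
    } while (read_retry(seq));
    return result;
}
```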
Since commit 56f047305dd4b6b617
("xfrm: add rcu grace period in xfrm_policy_destroy()") xfrm policy
objects are already free'd via rcu.
In order to make more places lockless (i.e. use rcu_read_lock instead of
grabbing read-side of policy rwlock) we only need to:
- use rcu_assign_pointer to store address of new hash table backend memory
- add rcu barrier so that freeing of old memory is delayed (expansion
and free happens from system workqueue, so synchronize_rcu is fine)
- use rcu_dereference to fetch current address of the hash table.
Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert
-
This is required once we allow lockless readers.
Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert
11 Aug, 2016
1 commit
-
Running LTP 'icmp-uni-basic.sh -6 -p ipcomp -m tunnel' test over
openvswitch + veth can trigger kernel panic:
BUG: unable to handle kernel NULL pointer dereference
at 00000000000000e0 IP: [] xfrm_input+0x82/0x750
...
[] xfrm6_rcv_spi+0x1e/0x20
[] xfrm6_tunnel_rcv+0x42/0x50 [xfrm6_tunnel]
[] tunnel6_rcv+0x3e/0x8c [tunnel6]
[] ip6_input_finish+0xd5/0x430
[] ip6_input+0x33/0x90
[] ip6_rcv_finish+0xa5/0xb0
...
It seems that tunnel.ip6 can have garbage values and is also dereferenced
without a proper check, only tunnel.ip4 is being verified. Fix it by
adding one more if block for AF_INET6 and initialize tunnel.ip6 with NULL
inside xfrm6_rcv_spi() (which is similar to xfrm4_rcv_spi()).
Fixes: 049f8e2 ("xfrm: Override skb->mark with tunnel->parm.i_key in xfrm_input")
Signed-off-by: Alexey Kodanev
Signed-off-by: Steffen Klassert
10 Aug, 2016
7 commits
-
push the lock down, after earlier patches we can rely on rcu to
make sure state struct won't go away.
Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert
-
Before xfrm_state_find() can use rcu_read_lock instead of xfrm_state_lock
we need to switch users of the hash table to assign/obtain the pointers
with the appropriate rcu helpers.
Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert
-
Once xfrm_state_find is lockless we have to cope with a concurrent
resize operation.
We use a sequence counter to block in case a resize is in progress
and to detect if we might have missed a state that got moved to
a new hash table.
Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert
-
The hash table backend memory and the state structs are free'd via
kfree/vfree.
Once we only rely on rcu during lookups we have to make sure no other cpu
is currently accessing this before doing the free.
Free operations already happen from worker so we can use synchronize_rcu
to wait until concurrent readers are done.
Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert
-
Once xfrm_state_lookup_byaddr no longer acquires the state lock another
cpu might be freeing the state entry at the same time.
To detect this we use atomic_inc_not_zero, we then signal -EAGAIN to
caller in case our result was stale.
Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert
-
This is required once we allow lockless access of bydst/bysrc hash tables.
Signed-off-by: Florian Westphal
Signed-off-by: Steffen Klassert
-
The xfrm_replay structures are never modified, so declare them as const.
Done with the help of Coccinelle.
Signed-off-by: Julia Lawall
Signed-off-by: Steffen Klassert
29 Jul, 2016
1 commit
-
Whenever thresholds are changed the hash tables are rebuilt. This is
done by enumerating all policies and hashing and inserting them into
the right table according to the thresholds and direction.
Because socket policies are also contained in net->xfrm.policy_all but
no hash tables are defined for their direction (dir + XFRM_POLICY_MAX)
this causes a NULL or invalid pointer dereference after returning from
policy_hash_bysel() if the rebuild is done while any socket policies
are installed.
Since the rebuild after changing thresholds is scheduled this crash
could even occur if the userland sets thresholds seemingly before
installing any socket policies.
Fixes: 53c2e285f970 ("xfrm: Do not hash socket policies")
Signed-off-by: Tobias Brunner
Acked-by: Herbert Xu
Signed-off-by: Steffen Klassert
27 Jul, 2016
2 commits
-
During fuzzing I regularly run into this WARN(). According to Herbert Xu,
this "certainly shouldn't be a WARN, it probably shouldn't print anything
either".
Cc: Stephen Hemminger
Cc: Steffen Klassert
Cc: Herbert Xu
Signed-off-by: Vegard Nossum
Signed-off-by: Steffen Klassert
-
AFAICT this message is just printed whenever input validation fails.
This is a normal failure and we shouldn't be dumping the stack over it.
Looks like it was originally a printk that was maybe incorrectly
upgraded to a WARN:
commit 62db5cfd70b1ef53aa21f144a806fe3b78c84fab
Author: stephen hemminger
Date: Wed May 12 06:37:06 2010 +0000
xfrm: add severity to printk
Cc: Stephen Hemminger
Cc: Steffen Klassert
Signed-off-by: Vegard Nossum
Signed-off-by: Steffen Klassert
18 Jul, 2016
1 commit
-
If we hit any of the error conditions inside xfrm_dump_sa(), then
xfrm_state_walk_init() never gets called. However, we still call
xfrm_state_walk_done() from xfrm_dump_sa_done(), which will crash
because the state walk was never initialized properly.
We can fix this by setting cb->args[0] only after we've processed the
first element and checking this before calling xfrm_state_walk_done().
Fixes: d3623099d3 ("ipsec: add support of limited SA dump")
Cc: Nicolas Dichtel
Cc: Steffen Klassert
Signed-off-by: Vegard Nossum
Acked-by: Nicolas Dichtel
Signed-off-by: Steffen Klassert
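The fix in this entry pairs teardown with a flag that is only set once setup has really happened. The sketch below is a simplified userspace model of that pattern, not the netlink dump code: struct cb_sketch, the walk helpers and the done_calls_on_uninit counter are hypothetical, and the early error return stands in for "any of the error conditions".

```c
#include <errno.h>

struct walk_sketch { int initialized; };

struct cb_sketch {
    long args[1];              /* args[0] != 0 means "walk was set up" */
    struct walk_sketch walk;
};

/* Counts teardowns of a walk that was never initialised (the crash case). */
static int done_calls_on_uninit;

static void walk_init_sketch(struct walk_sketch *w) { w->initialized = 1; }

static void walk_done_sketch(struct walk_sketch *w)
{
    if (!w->initialized)
        done_calls_on_uninit++;
    w->initialized = 0;
}

/* Dump step: errors can return before the walk is ever initialised. */
static int dump_sketch(struct cb_sketch *cb, int bad_input)
{
    if (!cb->args[0]) {
        if (bad_input)
            return -EINVAL;       /* bail out, flag stays 0 */
        walk_init_sketch(&cb->walk);
        cb->args[0] = 1;          /* set only once init has happened */
    }
    return 0;
}

/* Done callback: only tear down what was actually set up. */
static void dump_done_sketch(struct cb_sketch *cb)
{
    if (cb->args[0])
        walk_done_sketch(&cb->walk);
}
```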
10 May, 2016
1 commit
-
In netdevice.h we removed the structure in net-next that is being
changed in 'net'. In macsec.c and rtnetlink.c we have overlaps
between fixes in 'net' and the u64 attribute changes in 'net-next'.
The mlx5 conflicts have to do with vxlan support dependencies.
Signed-off-by: David S. Miller
05 May, 2016
1 commit
-
Steffen Klassert says:
====================
pull request (net): ipsec 2016-05-04
1) The flowcache can hit an OOM condition if too
many entries are in the gc_list. Fix this by
counting the entries in the gc_list and refuse
new allocations if the value is too high.
2) The inner headers are invalid after a xfrm transformation,
so reset the skb encapsulation field to ensure nobody tries
to access the inner headers. Otherwise tunnel devices stacked
on top of xfrm may build the outer headers based on wrong
information.
3) Add pmtu handling to vti, we need it to report
pmtu information for locally generated packets.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller
24 Apr, 2016
1 commit
-
Signed-off-by: Nicolas Dichtel
Signed-off-by: David S. Miller
25 Mar, 2016
1 commit
-
A crash is observed when a decrypted packet is processed in the receive
path. get_rps_cpu() tries to dereference the skb->dev fields but, judging
by the poison pattern, the device appears to have been freed already.
[] get_rps_cpu+0x94/0x2f0
[] netif_rx_internal+0x140/0x1cc
[] netif_rx+0x74/0x94
[] xfrm_input+0x754/0x7d0
[] xfrm_input_resume+0x10/0x1c
[] esp_input_done+0x20/0x30
[] process_one_work+0x244/0x3fc
[] worker_thread+0x2f8/0x418
[] kthread+0xe0/0xec
-013|get_rps_cpu(
| dev = 0xFFFFFFC08B688000,
| skb = 0xFFFFFFC0C76AAC00 -> (
| dev = 0xFFFFFFC08B688000 -> (
| name =
"......................................................
| name_hlist = (next = 0xAAAAAAAAAAAAAAAA, pprev =
0xAAAAAAAAAAA
Following is the sequence of events observed:
- Encrypted packet in receive path from netdevice is queued
- Encrypted packet queued for decryption (asynchronous)
- Netdevice brought down and freed
- Packet is decrypted and returned through callback in esp_input_done
- Packet is queued again for process in network stack using netif_rx
Since the device appears to have been freed, the dereference of
skb->dev in get_rps_cpu() leads to an unhandled page fault
exception.
Fix this by holding on to device reference when queueing packets
asynchronously and releasing the reference on call back return.
v2: Make the change generic to xfrm as mentioned by Steffen and
update the title to xfrm
Suggested-by: Herbert Xu
Signed-off-by: Jerome Stanislaus
Signed-off-by: Subash Abhinov Kasiviswanathan
Signed-off-by: David S. Miller
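The fix described in this entry (hold a device reference while a packet is away for asynchronous processing, release it when the callback returns) can be modelled in userspace with a plain reference count. This is a sketch only: the structs and the *_sketch functions are hypothetical and merely mirror the hold/put idea, not the kernel's netdevice API.

```c
#include <stdatomic.h>

/* Hypothetical stand-ins for the device and the packet pointing at it. */
struct netdev_sketch { atomic_int refcnt; };
struct packet_sketch { struct netdev_sketch *dev; };

static void dev_hold_sketch(struct netdev_sketch *d)
{
    atomic_fetch_add(&d->refcnt, 1);
}

/* Returns 1 when the last reference was dropped and the device may go. */
static int dev_put_sketch(struct netdev_sketch *d)
{
    return atomic_fetch_sub(&d->refcnt, 1) == 1;
}

/* Queue for asynchronous work: pin the device first. */
static void queue_async_sketch(struct packet_sketch *p)
{
    dev_hold_sketch(p->dev);
}

/* Completion callback: p->dev is still valid here; unpin it afterwards. */
static int async_done_sketch(struct packet_sketch *p)
{
    return dev_put_sketch(p->dev);
}
```

With the extra reference in place, bringing the device down while the packet is in flight only drops the owner's reference; the memory stays valid until the completion callback has released its own.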
23 Mar, 2016
1 commit
-
The code wants to prevent compat code from receiving messages. Use
in_compat_syscall for this.
Signed-off-by: Andy Lutomirski
Cc: Steffen Klassert
Cc: Herbert Xu
Cc: "David S. Miller"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
17 Mar, 2016
1 commit
-
The inner headers are invalid after a xfrm transformation.
So reset the skb encapsulation field to ensure nobody tries
to access the inner headers.
Signed-off-by: Steffen Klassert
27 Jan, 2016
1 commit
-
This patch removes the last reference to hash and ablkcipher from
IPsec and replaces them with ahash and skcipher respectively. For
skcipher there is currently no difference at all, while for ahash
the current code is actually buggy and would prevent asynchronous
algorithms from being discovered.
Signed-off-by: Herbert Xu
Acked-by: David S. Miller
16 Jan, 2016
1 commit
-
skb_gso_segment() uses skb control block during segmentation.
This patch adds 32-bytes room for previous control block which
will be copied into all resulting segments.
This patch fixes kernel crash during fragmenting forwarded packets.
Fragmentation requires valid IP CB in skb for clearing ip options.
Also patch removes custom save/restore in ovs code, now it's redundant.
Signed-off-by: Konstantin Khlebnikov
Link: http://lkml.kernel.org/r/CALYGNiP-0MZ-FExV2HutTvE9U-QQtkKSoE--KN=JQE5STYsjAA@mail.gmail.com
Signed-off-by: David S. Miller