10 Nov, 2020
1 commit
-
if xfrm_get_translator() failed, xfrm_user_policy() return without
freeing 'data', which is allocated in memdup_sockptr().Fixes: 96392ee5a13b ("xfrm/compat: Translate 32-bit user_policy from sockptr")
Reported-by: Hulk Robot
Signed-off-by: Yu Kuai
Signed-off-by: Steffen Klassert
09 Nov, 2020
3 commits
-
32-bit to 64-bit messages translator zerofies needed paddings in the
translation, the rest is the actual payload.
Don't allocate zero pages as they are not needed.Fixes: 5106f4a8acff ("xfrm/compat: Add 32=>64-bit messages translator")
Signed-off-by: Dmitry Safonov
Signed-off-by: Steffen Klassert -
32-bit messages translated by xfrm_compat can have attributes attached.
For all, but XFRMA_SA, XFRMA_POLICY the size of payload is the same
in 32-bit UABI and 64-bit UABI. For XFRMA_SA (struct xfrm_usersa_info)
and XFRMA_POLICY (struct xfrm_userpolicy_info) it's only tail-padding
that is present in 64-bit payload, but not in 32-bit.
The proper size for destination nlattr is already calculated by
xfrm_user_rcv_calculate_len64() and allocated with kvmalloc().xfrm_attr_cpy32() copies 32-bit copy_len into 64-bit attribute
translated payload, zero-filling possible padding for SA/POLICY.
Due to a typo, *pos already has 64-bit payload size, in a result next
memset(0) is called on the memory after the translated attribute, not on
the tail-padding of it.Fixes: 5106f4a8acff ("xfrm/compat: Add 32=>64-bit messages translator")
Reported-by: syzbot+c43831072e7df506a646@syzkaller.appspotmail.com
Signed-off-by: Dmitry Safonov
Signed-off-by: Steffen Klassert -
xfrm_xlate32() translates 64-bit message provided by kernel to be sent
for 32-bit listener (acknowledge or monitor). Translator code doesn't
expect XFRMA_UNSPEC attribute as it doesn't know its payload.
Kernel never attaches such attribute, but a user can.I've searched if any opensource does it and the answer is no.
Nothing on github and google finds only tfcproject that has such code
commented-out.What will happen if a user sends a netlink message with XFRMA_UNSPEC
attribute? Ipsec code ignores this attribute. But if there is a
monitor-process or 32-bit user requested ack - kernel will try to
translate such message and will hit WARN_ONCE() in xfrm_xlate64_attr().Deal with XFRMA_UNSPEC by copying the attribute payload with
xfrm_nla_cpy(). In result, the default switch-case in xfrm_xlate64_attr()
becomes an unused code. Leave those 3 lines in case a new xfrm attribute
will be added.Fixes: 5461fc0c8d9f ("xfrm/compat: Add 64=>32-bit messages translator")
Reported-by: syzbot+a7e701c8385bd8543074@syzkaller.appspotmail.com
Signed-off-by: Dmitry Safonov
Signed-off-by: Steffen Klassert
05 Nov, 2020
1 commit
-
Steffen Klassert says:
====================
1) Fix packet receiving of standard IP tunnels when the xfrm_interface
module is installed. From Xin Long.2) Fix a race condition between spi allocating and hash list
resizing. From zhuoliang zhang.
====================Signed-off-by: Jakub Kicinski
23 Oct, 2020
1 commit
-
we found that the following race condition exists in
xfrm_alloc_userspi flow:user thread state_hash_work thread
---- ----
xfrm_alloc_userspi()
__find_acq_core()
/*alloc new xfrm_state:x*/
xfrm_state_alloc()
/*schedule state_hash_work thread*/
xfrm_hash_grow_check() xfrm_hash_resize()
xfrm_alloc_spi /*hold lock*/
x->id.spi = htonl(spi) spin_lock_bh(&net->xfrm.xfrm_state_lock)
/*waiting lock release*/ xfrm_hash_transfer()
spin_lock_bh(&net->xfrm.xfrm_state_lock) /*add x into hlist:net->xfrm.state_byspi*/
hlist_add_head_rcu(&x->byspi)
spin_unlock_bh(&net->xfrm.xfrm_state_lock)/*add x into hlist:net->xfrm.state_byspi 2 times*/
hlist_add_head_rcu(&x->byspi)1. a new state x is alloced in xfrm_state_alloc() and added into the bydst hlist
in __find_acq_core() on the LHS;
2. on the RHS, state_hash_work thread travels the old bydst and tranfers every xfrm_state
(include x) into the new bydst hlist and new byspi hlist;
3. user thread on the LHS gets the lock and adds x into the new byspi hlist again.So the same xfrm_state (x) is added into the same list_hash
(net->xfrm.state_byspi) 2 times that makes the list_hash become
an inifite loop.To fix the race, x->id.spi = htonl(spi) in the xfrm_alloc_spi() is moved
to the back of spin_lock_bh, sothat state_hash_work thread no longer add x
which id.spi is zero into the hash_list.Fixes: f034b5d4efdf ("[XFRM]: Dynamic xfrm_state hash table sizing.")
Signed-off-by: zhuoliang zhang
Acked-by: Herbert Xu
Signed-off-by: Steffen Klassert
14 Oct, 2020
1 commit
-
Simplify the code by using new function dev_fetch_sw_netstats().
Signed-off-by: Heiner Kallweit
Link: https://lore.kernel.org/r/a6b816f4-bbf2-9db0-d59a-7e4e9cc808fe@gmail.com
Signed-off-by: Jakub Kicinski
09 Oct, 2020
1 commit
-
As Nicolas noticed in his case, when xfrm_interface module is installed
the standard IP tunnels will break in receiving packets.This is caused by the IP tunnel handlers with a higher priority in xfrm
interface processing incoming packets by xfrm_input(), which would drop
the packets and return 0 instead when anything wrong happens.Rather than changing xfrm_input(), this patch is to adjust the priority
for the IP tunnel handlers in xfrm interface, so that the packets would
go to xfrmi's later than the others', as the others' would not drop the
packets when the handlers couldn't process them.Note that IPCOMP also defines its own IPIP tunnel handler and it calls
xfrm_input() as well, so we must make its priority lower than xfrmi's,
which means having xfrmi loaded would still break IPCOMP. We may seek
another way to fix it in xfrm_input() in the future.Reported-by: Nicolas Dichtel
Tested-by: Nicolas Dichtel
Fixes: da9bbf0598c9 ("xfrm: interface: support IPIP and IPIP6 tunnels processing with .cb_handler")
FIxes: d7b360c2869f ("xfrm: interface: support IP6IP6 and IP6IP tunnels processing with .cb_handler")
Signed-off-by: Xin Long
Signed-off-by: Steffen Klassert
06 Oct, 2020
2 commits
-
use new helper for netstats settings
Signed-off-by: Fabian Frederick
Acked-by: Herbert Xu
Signed-off-by: David S. Miller -
Rejecting non-native endian BTF overlapped with the addition
of support for it.The rest were more simple overlapping changes, except the
renesas ravb binding update, which had to follow a file
move as well as a YAML conversion.Signed-off-by: David S. Miller
29 Sep, 2020
1 commit
-
Steffen Klassert says:
====================
pull request (net): ipsec 2020-09-281) Fix a build warning in ip_vti if CONFIG_IPV6 is not set.
From YueHaibing.2) Restore IPCB on espintcp before handing the packet to xfrm
as the information there is still needed.
From Sabrina Dubroca.3) Fix pmtu updating for xfrm interfaces.
From Sabrina Dubroca.4) Some xfrm state information was not cloned with xfrm_do_migrate.
Fixes to clone the full xfrm state, from Antony Antony.5) Use the correct address family in xfrm_state_find. The struct
flowi must always be interpreted along with the original
address family. This got lost over the years.
Fix from Herbert Xu.
====================Signed-off-by: David S. Miller
25 Sep, 2020
1 commit
-
The struct flowi must never be interpreted by itself as its size
depends on the address family. Therefore it must always be grouped
with its original family value.In this particular instance, the original family value is lost in
the function xfrm_state_find. Therefore we get a bogus read when
it's coupled with the wrong family which would occur with inter-
family xfrm states.This patch fixes it by keeping the original family value.
Note that the same bug could potentially occur in LSM through
the xfrm_state_pol_flow_match hook. I checked the current code
there and it seems to be safe for now as only secid is used which
is part of struct flowi_common. But that API should be changed
so that so that we don't get new bugs in the future. We could
do that by replacing fl with just secid or adding a family field.Reported-by: syzbot+577fbac3145a6eb2e7a5@syzkaller.appspotmail.com
Fixes: 48b8d78315bf ("[XFRM]: State selection update to use inner...")
Signed-off-by: Herbert Xu
Signed-off-by: Steffen Klassert
24 Sep, 2020
5 commits
-
Provide compat_xfrm_userpolicy_info translation for xfrm setsocketopt().
Reallocate buffer and put the missing padding for 64-bit message.Signed-off-by: Dmitry Safonov
Signed-off-by: Steffen Klassert -
Provide the user-to-kernel translator under XFRM_USER_COMPAT, that
creates for 32-bit xfrm-user message a 64-bit translation.
The translation is afterwards reused by xfrm_user code just as if
userspace had sent 64-bit message.Signed-off-by: Dmitry Safonov
Signed-off-by: Steffen Klassert -
Currently nlmsg_unicast() is used by functions that dump structures that
can be different in size for compat tasks, see dump_one_state() and
dump_one_policy().The following nlmsg_unicast() users exist today in xfrm:
Function | Message can be different
| in size on compat
-------------------------------------------|------------------------------
xfrm_get_spdinfo() | N
xfrm_get_sadinfo() | N
xfrm_get_sa() | Y
xfrm_alloc_userspi() | Y
xfrm_get_policy() | Y
xfrm_get_ae() | NBesides, dump_one_state() and dump_one_policy() can be used by filtered
netlink dump for XFRM_MSG_GETSA, XFRM_MSG_GETPOLICY.Just as for xfrm multicast, allocate frag_list for compat skb journey
down to recvmsg() which will give user the desired skb according to
syscall bitness.Signed-off-by: Dmitry Safonov
Signed-off-by: Steffen Klassert
-
Provide the kernel-to-user translator under XFRM_USER_COMPAT, that
creates for 64-bit xfrm-user message a 32-bit translation and puts it
in skb's frag_list. net/compat.c layer provides MSG_CMSG_COMPAT to
decide if the message should be taken from skb or frag_list.
(used by wext-core which has also an ABI difference)Kernel sends 64-bit xfrm messages to the userspace for:
- multicast (monitor events)
- netlink dumpsWire up the translator to xfrm_nlmsg_multicast().
Signed-off-by: Dmitry Safonov
Signed-off-by: Steffen Klassert -
Add a skeleton for xfrm_compat module and provide API to register it in
xfrm_state.ko. struct xfrm_translator will have function pointers to
translate messages received from 32-bit userspace or to be sent to it
from 64-bit kernel.
module_get()/module_put() are used instead of rcu_read_lock() as the
module will vmalloc() memory for translation.
The new API is registered with xfrm_state module, not with xfrm_user as
the former needs translator for user_policy set by setsockopt() and
xfrm_user already uses functions from xfrm_state.Signed-off-by: Dmitry Safonov
Signed-off-by: Steffen Klassert
07 Sep, 2020
3 commits
-
When we clone state only add_time was cloned. It missed values like
bytes, packets. Now clone the all members of the structure.v1->v3:
- use memcpy to copy the entire structureFixes: 80c9abaabf42 ("[XFRM]: Extension for dynamic update of endpoint address(es)")
Signed-off-by: Antony Antony
Signed-off-by: Steffen Klassert -
XFRMA_SEC_CTX was not cloned from the old to the new.
Migrate this attribute during XFRMA_MSG_MIGRATEv1->v2:
- return -ENOMEM on error
v2->v3:
- fix return type to intFixes: 80c9abaabf42 ("[XFRM]: Extension for dynamic update of endpoint address(es)")
Signed-off-by: Antony Antony
Signed-off-by: Steffen Klassert -
XFRMA_SET_MARK and XFRMA_SET_MARK_MASK was not cloned from the old
to the new. Migrate these two attributes during XFRMA_MSG_MIGRATEFixes: 9b42c1f179a6 ("xfrm: Extend the output_mark to support input direction and masking.")
Signed-off-by: Antony Antony
Signed-off-by: Steffen Klassert
27 Aug, 2020
1 commit
-
xfrm interfaces currently test for !skb->ignore_df when deciding
whether to update the pmtu on the skb's dst. Because of this, no pmtu
exception is created when we do something like:ping -s 1438
By dropping this check, the pmtu exception will be created and the
next ping attempt will work.Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces")
Reported-by: Xiumei Mu
Signed-off-by: Sabrina Dubroca
Signed-off-by: Steffen Klassert
24 Aug, 2020
1 commit
-
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through
Signed-off-by: Gustavo A. R. Silva
17 Aug, 2020
1 commit
-
Xiumei reported a bug with espintcp over IPv6 in transport mode,
because xfrm6_transport_finish expects to find IP6CB data (struct
inet6_skb_cb). Currently, espintcp zeroes the CB, but the relevant
part is actually preserved by previous layers (first set up by tcp,
then strparser only zeroes a small part of tcp_skb_tb), so we can just
relocate it to the start of skb->cb.Fixes: e27cca96cd68 ("xfrm: add espintcp (RFC 8229)")
Reported-by: Xiumei Mu
Signed-off-by: Sabrina Dubroca
Signed-off-by: Steffen Klassert
11 Aug, 2020
1 commit
-
Pull locking updates from Thomas Gleixner:
"A set of locking fixes and updates:- Untangle the header spaghetti which causes build failures in
various situations caused by the lockdep additions to seqcount to
validate that the write side critical sections are non-preemptible.- The seqcount associated lock debug addons which were blocked by the
above fallout.seqcount writers contrary to seqlock writers must be externally
serialized, which usually happens via locking - except for strict
per CPU seqcounts. As the lock is not part of the seqcount, lockdep
cannot validate that the lock is held.This new debug mechanism adds the concept of associated locks.
sequence count has now lock type variants and corresponding
initializers which take a pointer to the associated lock used for
writer serialization. If lockdep is enabled the pointer is stored
and write_seqcount_begin() has a lockdep assertion to validate that
the lock is held.Aside of the type and the initializer no other code changes are
required at the seqcount usage sites. The rest of the seqcount API
is unchanged and determines the type at compile time with the help
of _Generic which is possible now that the minimal GCC version has
been moved up.Adding this lockdep coverage unearthed a handful of seqcount bugs
which have been addressed already independent of this.While generally useful this comes with a Trojan Horse twist: On RT
kernels the write side critical section can become preemtible if
the writers are serialized by an associated lock, which leads to
the well known reader preempts writer livelock. RT prevents this by
storing the associated lock pointer independent of lockdep in the
seqcount and changing the reader side to block on the lock when a
reader detects that a writer is in the write side critical section.- Conversion of seqcount usage sites to associated types and
initializers"* tag 'locking-urgent-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
locking/seqlock, headers: Untangle the spaghetti monster
locking, arch/ia64: Reduce header dependencies by moving XTP bits into the new header
x86/headers: Remove APIC headers from
seqcount: More consistent seqprop names
seqcount: Compress SEQCNT_LOCKNAME_ZERO()
seqlock: Fold seqcount_LOCKNAME_init() definition
seqlock: Fold seqcount_LOCKNAME_t definition
seqlock: s/__SEQ_LOCKDEP/__SEQ_LOCK/g
hrtimer: Use sequence counter with associated raw spinlock
kvm/eventfd: Use sequence counter with associated spinlock
userfaultfd: Use sequence counter with associated spinlock
NFSv4: Use sequence counter with associated spinlock
iocost: Use sequence counter with associated spinlock
raid5: Use sequence counter with associated spinlock
vfs: Use sequence counter with associated spinlock
timekeeping: Use sequence counter with associated raw spinlock
xfrm: policy: Use sequence counters with associated lock
netfilter: nft_set_rbtree: Use sequence counter with associated rwlock
netfilter: conntrack: Use sequence counter with associated spinlock
sched: tasks: Use sequence counter with associated spinlock
...
02 Aug, 2020
1 commit
-
Resolved kernel/bpf/btf.c using instructions from merge commit
69138b34a7248d2396ab85c8652e20c0c39beabaSigned-off-by: David S. Miller
01 Aug, 2020
1 commit
-
Steffen Klassert says:
====================
pull request (net): ipsec 2020-07-311) Fix policy matching with mark and mask on userspace interfaces.
From Xin Long.2) Several fixes for the new ESP in TCP encapsulation.
From Sabrina Dubroca.3) Fix crash when the hold queue is used. The assumption that
xdst->path and dst->child are not a NULL pointer only if dst->xfrm
is not a NULL pointer is true with the exception of using the
hold queue. Fix this by checking for hold queue usage before
dereferencing xdst->path or dst->child.4) Validate pfkey_dump parameter before sending them.
From Mark Salyzyn.5) Fix the location of the transport header with ESP in UDPv6
encapsulation. From Sabrina Dubroca.
====================Signed-off-by: David S. Miller
31 Jul, 2020
1 commit
-
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2020-07-30Please note that I did the first time now --no-ff merges
of my testing branch into the master branch to include
the [PATCH 0/n] message of a patchset. Please let me
know if this is desirable, or if I should do it any
different.1) Introduce a oseq-may-wrap flag to disable anti-replay
protection for manually distributed ICVs as suggested
in RFC 4303. From Petr Vaněk.2) Patchset to fully support IPCOMP for vti4, vti6 and
xfrm interfaces. From Xin Long.3) Switch from a linear list to a hash list for xfrm interface
lookups. From Eyal Birger.4) Fixes to not register one xfrm(6)_tunnel object twice.
From Xin Long.5) Fix two compile errors that were introduced with the
IPCOMP support for vti and xfrm interfaces.
Also from Xin Long.6) Make the policy hold queue work with VTI. This was
forgotten when VTI was implemented.Please pull or let me know if there are problems.
====================Signed-off-by: David S. Miller
30 Jul, 2020
2 commits
-
Currently, espintcp_rcv drops packets silently, which makes debugging
issues difficult. Count packets as either XfrmInHdrError (when the
packet was too short or contained invalid data) or XfrmInError (for
other issues).Signed-off-by: Sabrina Dubroca
Signed-off-by: Steffen Klassert -
Currently, short messages (less than 4 bytes after the length header)
will break the stream of messages. This is unnecessary, since we can
still parse messages even if they're too short to contain any usable
data. This is also bogus, as keepalive messages (a single 0xff byte),
though not needed with TCP encapsulation, should be allowed.This patch changes the stream parser so that short messages are
accepted and dropped in the kernel. Messages that contain a valid SPI
or non-ESP header are processed as before.Fixes: e27cca96cd68 ("xfrm: add espintcp (RFC 8229)")
Reported-by: Andrew Cagney
Signed-off-by: Sabrina Dubroca
Signed-off-by: Steffen Klassert
29 Jul, 2020
1 commit
-
A sequence counter write side critical section must be protected by some
form of locking to serialize writers. If the serialization primitive is
not disabling preemption implicitly, preemption has to be explicitly
disabled before entering the sequence counter write side critical
section.A plain seqcount_t does not contain the information of which lock must
be held when entering a write side critical section.Use the new seqcount_spinlock_t and seqcount_mutex_t data types instead,
which allow to associate a lock with the sequence counter. This enables
lockdep to verify that the lock used for writer serialization is held
when the write side critical section is entered.If lockdep is disabled this lock association is compiled out and has
neither storage size nor runtime overhead.Signed-off-by: Ahmed S. Darwish
Signed-off-by: Peter Zijlstra (Intel)
Link: https://lkml.kernel.org/r/20200720155530.1173732-17-a.darwish@linutronix.de
25 Jul, 2020
1 commit
-
Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.Signed-off-by: Christoph Hellwig
Signed-off-by: David S. Miller
21 Jul, 2020
1 commit
-
We forgot to support the xfrm policy hold queue when
VTI was implemented. This patch adds everything we
need so that we can use the policy hold queue together
with VTI interfaces.Signed-off-by: Steffen Klassert
17 Jul, 2020
4 commits
-
kernel test robot reported some compile errors:
ia64-linux-ld: net/xfrm/xfrm_interface.o: in function `xfrmi4_fini':
net/xfrm/xfrm_interface.c:900: undefined reference to `xfrm4_tunnel_deregister'
ia64-linux-ld: net/xfrm/xfrm_interface.c:901: undefined reference to `xfrm4_tunnel_deregister'
ia64-linux-ld: net/xfrm/xfrm_interface.o: in function `xfrmi4_init':
net/xfrm/xfrm_interface.c:873: undefined reference to `xfrm4_tunnel_register'
ia64-linux-ld: net/xfrm/xfrm_interface.c:876: undefined reference to `xfrm4_tunnel_register'
ia64-linux-ld: net/xfrm/xfrm_interface.c:885: undefined reference to `xfrm4_tunnel_deregister'This happened when set CONFIG_XFRM_INTERFACE=y and CONFIG_INET_TUNNEL=m.
We don't really want xfrm_interface to depend inet_tunnel completely,
but only to disable the tunnel code when inet_tunnel is not seen.So instead of adding "select INET_TUNNEL" for XFRM_INTERFACE, this patch
is only to change to IS_REACHABLE to avoid these compile error.Reported-by: kernel test robot
Fixes: da9bbf0598c9 ("xfrm: interface: support IPIP and IPIP6 tunnels processing with .cb_handler")
Signed-off-by: Xin Long
Signed-off-by: Steffen Klassert -
In case we're compiling espintcp support only for IPv6, we should
still initialize the common code.Fixes: 26333c37fc28 ("xfrm: add IPv6 support for espintcp")
Signed-off-by: Sabrina Dubroca
Signed-off-by: Steffen Klassert -
man 2 recv says:
RETURN VALUE
When a stream socket peer has performed an orderly shutdown, the
return value will be 0 (the traditional "end-of-file" return).Currently, this works for blocking reads, but non-blocking reads will
return -EAGAIN. This patch overwrites that return value when the peer
won't send us any more data.Fixes: e27cca96cd68 ("xfrm: add espintcp (RFC 8229)")
Reported-by: Andrew Cagney
Tested-by: Andrew Cagney
Signed-off-by: Sabrina Dubroca
Signed-off-by: Steffen Klassert -
Currently, non-blocking sends from userspace result in EOPNOTSUPP.
To support this, we need to tell espintcp_sendskb_locked() and
espintcp_sendskmsg_locked() that non-blocking operation was requested
from espintcp_sendmsg().Fixes: e27cca96cd68 ("xfrm: add espintcp (RFC 8229)")
Reported-by: Andrew Cagney
Tested-by: Andrew Cagney
Signed-off-by: Sabrina Dubroca
Signed-off-by: Steffen Klassert
14 Jul, 2020
1 commit
-
As we did in the last 2 patches for vti(6), this patch is to define a
new xfrm_tunnel object 'xfrmi_ipip6_handler' to register for AF_INET6,
and a new xfrm6_tunnel object 'xfrmi_ip6ip_handler' to register for
AF_INET.Signed-off-by: Xin Long
Signed-off-by: Steffen Klassert
13 Jul, 2020
2 commits
-
xfrmi_lookup() is called on every packet. Using a single list for
looking up if_id becomes a bottleneck when having many xfrm interfaces.Signed-off-by: Eyal Birger
Signed-off-by: Steffen Klassert -
The xfrmi context exists in the netdevice priv context.
Avoid looking for it in a separate list.Signed-off-by: Eyal Birger
Signed-off-by: Steffen Klassert
11 Jul, 2020
1 commit
-
All conflicts seemed rather trivial, with some guidance from
Saeed Mameed on the tc_ct.c one.Signed-off-by: David S. Miller