02 Oct, 2014
1 commit
-
Lets use a proper structure to clearly document and implement
skb fast clones.Then, we might experiment more easily alternative layouts.
This patch adds a new skb_fclone_busy() helper, used by tcp and xfrm,
to stop leaking of implementation details.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
29 Sep, 2014
1 commit
-
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2014-09-251) Remove useless hash_resize_mutex in xfrm_hash_resize().
This mutex is used only there, but xfrm_hash_resize()
can't be called concurrently at all. From Ying Xue.2) Extend policy hashing to prefixed policies based on
prefix lenght thresholds. From Christophe Gouault.3) Make the policy hash table thresholds configurable
via netlink. From Christophe Gouault.4) Remove the maximum authentication length for AH.
This was needed to limit stack usage. We switched
already to allocate space, so no need to keep the
limit. From Herbert Xu.
====================Signed-off-by: David S. Miller
24 Sep, 2014
1 commit
-
Conflicts:
arch/mips/net/bpf_jit.c
drivers/net/can/flexcan.cBoth the flexcan and MIPS bpf_jit conflicts were cases of simple
overlapping changes.Signed-off-by: David S. Miller
18 Sep, 2014
1 commit
-
While tracking down the MAX_AH_AUTH_LEN crash in an old kernel
I thought that this limit was rather arbitrary and we should
just get rid of it.In fact it seems that we've already done all the work needed
to remove it apart from actually removing it. This limit was
there in order to limit stack usage. Since we've already
switched over to allocating scratch space using kmalloc, there
is no longer any need to limit the authentication length.This patch kills all references to it, including the BUG_ONs
that led me here.Signed-off-by: Herbert Xu
Signed-off-by: Steffen Klassert
16 Sep, 2014
2 commits
-
Currently we genarate a queueing route if we have matching policies
but can not resolve the states and the sysctl xfrm_larval_drop is
disabled. Here we assume that dst_output() is called to kill the
queued packets. Unfortunately this assumption is not true in all
cases, so it is possible that these packets leave the system unwanted.We fix this by generating queueing routes only from the
route lookup functions, here we can guarantee a call to
dst_output() afterwards.Fixes: a0073fe18e71 ("xfrm: Add a state resolution packet queue")
Reported-by: Konstantinos Kolelis
Signed-off-by: Steffen Klassert -
Currently we genarate a blackhole route route whenever we have
matching policies but can not resolve the states. Here we assume
that dst_output() is called to kill the balckholed packets.
Unfortunately this assumption is not true in all cases, so
it is possible that these packets leave the system unwanted.We fix this by generating blackhole routes only from the
route lookup functions, here we can guarantee a call to
dst_output() afterwards.Fixes: 2774c131b1d ("xfrm: Handle blackhole route creation via afinfo.")
Reported-by: Konstantinos Kolelis
Signed-off-by: Steffen Klassert
10 Sep, 2014
1 commit
-
Signed-off-by: Florian Westphal
Signed-off-by: David S. Miller
02 Sep, 2014
2 commits
-
Enable to specify local and remote prefix length thresholds for the
policy hash table via a netlink XFRM_MSG_NEWSPDINFO message.prefix length thresholds are specified by XFRMA_SPD_IPV4_HTHRESH and
XFRMA_SPD_IPV6_HTHRESH optional attributes (struct xfrmu_spdhthresh).example:
struct xfrmu_spdhthresh thresh4 = {
.lbits = 0;
.rbits = 24;
};
struct xfrmu_spdhthresh thresh6 = {
.lbits = 0;
.rbits = 56;
};
struct nlmsghdr *hdr;
struct nl_msg *msg;msg = nlmsg_alloc();
hdr = nlmsg_put(msg, NL_AUTO_PORT, NL_AUTO_SEQ, XFRMA_SPD_IPV4_HTHRESH, sizeof(__u32), NLM_F_REQUEST);
nla_put(msg, XFRMA_SPD_IPV4_HTHRESH, sizeof(thresh4), &thresh4);
nla_put(msg, XFRMA_SPD_IPV6_HTHRESH, sizeof(thresh6), &thresh6);
nla_send_auto(sk, msg);The numbers are the policy selector minimum prefix lengths to put a
policy in the hash table.- lbits is the local threshold (source address for out policies,
destination address for in and fwd policies).- rbits is the remote threshold (destination address for out
policies, source address for in and fwd policies).The default values are:
XFRMA_SPD_IPV4_HTHRESH: 32 32
XFRMA_SPD_IPV6_HTHRESH: 128 128Dynamic re-building of the SPD is performed when the thresholds values
are changed.The current thresholds can be read via a XFRM_MSG_GETSPDINFO request:
the kernel replies to XFRM_MSG_GETSPDINFO requests by an
XFRM_MSG_NEWSPDINFO message, with both attributes
XFRMA_SPD_IPV4_HTHRESH and XFRMA_SPD_IPV6_HTHRESH.Signed-off-by: Christophe Gouault
Signed-off-by: Steffen Klassert -
The idea is an extension of the current policy hashing.
Today only non-prefixed policies are stored in a hash table. This
patch relaxes the constraints, and hashes policies whose prefix
lengths are greater or equal to a configurable threshold.Each hash table (one per direction) maintains its own set of IPv4 and
IPv6 thresholds (dbits4, sbits4, dbits6, sbits6), by default (32, 32,
128, 128).Example, if the output hash table is configured with values (16, 24,
56, 64):ip xfrm policy add dir out src 10.22.0.0/20 dst 10.24.1.0/24 ... => hashed
ip xfrm policy add dir out src 10.22.0.0/16 dst 10.24.1.1/32 ... => hashed
ip xfrm policy add dir out src 10.22.0.0/16 dst 10.24.0.0/16 ... => unhashedip xfrm policy add dir out \
src 3ffe:304:124:2200::/60 dst 3ffe:304:124:2401::/64 ... => hashed
ip xfrm policy add dir out \
src 3ffe:304:124:2200::/56 dst 3ffe:304:124:2401::2/128 ... => hashed
ip xfrm policy add dir out \
src 3ffe:304:124:2200::/56 dst 3ffe:304:124:2400::/56 ... => unhashedThe high order bits of the addresses (up to the threshold) are used to
compute the hash key.Signed-off-by: Christophe Gouault
Signed-off-by: Steffen Klassert
29 Aug, 2014
1 commit
-
In xfrm_state.c, hash_resize_mutex is defined as a local variable
and only used in xfrm_hash_resize() which is declared as a work
handler of xfrm.state_hash_work. But when the xfrm.state_hash_work
work is put in the global workqueue(system_wq) with schedule_work(),
the work will be really inserted in the global workqueue if it was
not already queued, otherwise, it is still left in the same position
on the the global workqueue. This means the xfrm_hash_resize() work
handler is only executed once at any time no matter how many times
its work is scheduled, that is, xfrm_hash_resize() is not called
concurrently at all, so hash_resize_mutex is redundant for us.Cc: Christophe Gouault
Cc: Steffen Klassert
Signed-off-by: Ying Xue
Acked-by: David S. Miller
Signed-off-by: Steffen Klassert
07 Aug, 2014
1 commit
-
All other add functions for lists have the new item as first argument
and the position where it is added as second argument. This was changed
for no good reason in this function and makes using it unnecessary
confusing.The name was changed to hlist_add_behind() to cause unconverted code to
generate a compile error instead of using the wrong parameter order.[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Ken Helias
Cc: "Paul E. McKenney"
Acked-by: Jeff Kirsher [intel driver bits]
Cc: Hugh Dickins
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
30 Jun, 2014
1 commit
-
The SPI check introduced in ea9884b3acf3311c8a11db67bfab21773f6f82ba
was intended for IPComp SAs but actually prevented AH SAs from getting
installed (depending on the SPI).Fixes: ea9884b3acf3 ("xfrm: check user specified spi for IPComp")
Cc: Fan Du
Signed-off-by: Tobias Brunner
Signed-off-by: Steffen Klassert
26 Jun, 2014
1 commit
-
xfrm_lookup must return a dst_entry with a refcount for the caller.
Git commit 1a1ccc96abb ("xfrm: Remove caching of xfrm_policy_sk_bundles")
removed this refcount for the socket policy case accidentally.
This patch restores it and sets DST_NOCACHE flag to make sure
that the dst_entry is freed when the refcount becomes null.Fixes: 1a1ccc96abb ("xfrm: Remove caching of xfrm_policy_sk_bundles")
Signed-off-by: Steffen Klassert
04 Jun, 2014
2 commits
-
Conflicts:
include/net/inetpeer.h
net/ipv6/output_core.cChanges in net were fixing bugs in code removed in net-next.
Signed-off-by: David S. Miller
-
The xfrm_user module registers its pernet init/exit after xfrm
itself so that its net exit function xfrm_user_net_exit() is
executed before xfrm_net_exit() which calls xfrm_state_fini() to
cleanup the SA's (xfrm states). This opens a window between
zeroing net->xfrm.nlsk pointer and deleting all xfrm_state
instances which may access it (via the timer). If an xfrm state
expires in this window, xfrm_exp_state_notify() will pass null
pointer as socket to nlmsg_multicast().As the notifications are called inside rcu_read_lock() block, it
is sufficient to retrieve the nlsk socket with rcu_dereference()
and check the it for null.Signed-off-by: Michal Kubecek
Signed-off-by: David S. Miller
23 May, 2014
1 commit
-
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2014-05-22This is the last ipsec pull request before I leave for
a three weeks vacation tomorrow. David, can you please
take urgent ipsec patches directly into net/net-next
during this time?I'll continue to run the ipsec/ipsec-next trees as soon
as I'm back.1) Simplify the xfrm audit handling, from Tetsuo Handa.
2) Codingstyle cleanup for xfrm_output, from abian Frederick.
Please pull or let me know if there are problems.
====================Signed-off-by: David S. Miller
13 May, 2014
2 commits
-
Fix checkpatch warning:
"WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable"Cc: Steffen Klassert
Cc: Herbert Xu
Cc: Andrew Morton
Signed-off-by: Fabian Frederick
Signed-off-by: Steffen Klassert -
Conflicts:
drivers/net/ethernet/altera/altera_sgdma.c
net/netlink/af_netlink.c
net/sched/cls_api.c
net/sched/sch_api.cThe netlink conflict dealt with moving to netlink_capable() and
netlink_ns_capable() in the 'net' tree vs. supporting 'tc' operations
in non-init namespaces. These were simple transformations from
netlink_capable to netlink_ns_capable.The Altera driver conflict was simply code removal overlapping some
void pointer cast cleanups in net-next.Signed-off-by: David S. Miller
08 May, 2014
1 commit
-
commit 8f0ea0fe3a036a47767f9c80e (snmp: reduce percpu needs by 50%)
reduced snmp array size to 1, so technically it doesn't have to be
an array any more. What's more, after the following commit:commit 933393f58fef9963eac61db8093689544e29a600
Date: Thu Dec 22 11:58:51 2011 -0600percpu: Remove irqsafe_cpu_xxx variants
We simply say that regular this_cpu use must be safe regardless of
preemption and interrupt state. That has no material change for x86
and s390 implementations of this_cpu operations. However, arches that
do not provide their own implementation for this_cpu operations will
now get code generated that disables interrupts instead of preemption.probably no arch wants to have SNMP_ARRAY_SZ == 2. At least after
almost 3 years, no one complains.So, just convert the array to a single pointer and remove snmp_mib_init()
and snmp_mib_free() as well.Cc: Christoph Lameter
Cc: Eric Dumazet
Cc: David S. Miller
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller
25 Apr, 2014
1 commit
-
It is possible by passing a netlink socket to a more privileged
executable and then to fool that executable into writing to the socket
data that happens to be valid netlink message to do something that
privileged executable did not intend to do.To keep this from happening replace bare capable and ns_capable calls
with netlink_capable, netlink_net_calls and netlink_ns_capable calls.
Which act the same as the previous calls except they verify that the
opener of the socket had the desired permissions as well.Reported-by: Andy Lutomirski
Signed-off-by: "Eric W. Biederman"
Signed-off-by: David S. Miller
23 Apr, 2014
1 commit
-
Commit f1370cc4 "xfrm: Remove useless secid field from xfrm_audit." changed
"struct xfrm_audit" to have either
{ audit_get_loginuid(current) / audit_get_sessionid(current) } or
{ INVALID_UID / -1 } pair.This means that we can represent "struct xfrm_audit" as "bool".
This patch replaces "struct xfrm_audit" argument with "bool".Signed-off-by: Tetsuo Handa
Signed-off-by: Steffen Klassert
22 Apr, 2014
1 commit
-
It seems to me that commit ab5f5e8b "[XFRM]: xfrm audit calls" is doing
something strange at xfrm_audit_helper_usrinfo().
If secid != 0 && security_secid_to_secctx(secid) != 0, the caller calls
audit_log_task_context() which basically does
secid != 0 && security_secid_to_secctx(secid) == 0 case
except that secid is obtained from current thread's context.Oh, what happens if secid passed to xfrm_audit_helper_usrinfo() was
obtained from other thread's context? It might audit current thread's
context rather than other thread's context if security_secid_to_secctx()
in xfrm_audit_helper_usrinfo() failed for some reason.Then, are all the caller of xfrm_audit_helper_usrinfo() passing either
secid obtained from current thread's context or secid == 0?
It seems to me that they are.If I didn't miss something, we don't need to pass secid to
xfrm_audit_helper_usrinfo() because audit_log_task_context() will
obtain secid from current thread's context.Signed-off-by: Tetsuo Handa
Signed-off-by: Steffen Klassert
16 Apr, 2014
1 commit
-
In the dst->output() path for ipv4, the code assumes the skb it has to
transmit is attached to an inet socket, specifically via
ip_mc_output() : The sk_mc_loop() test triggers a WARN_ON() when the
provider of the packet is an AF_PACKET socket.The dst->output() method gets an additional 'struct sock *sk'
parameter. This needs a cascade of changes so that this parameter can
be propagated from vxlan to final consumer.Fixes: 8f646c922d55 ("vxlan: keep original skb ownership")
Reported-by: lucien xin
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
26 Mar, 2014
1 commit
-
Conflicts:
Documentation/devicetree/bindings/net/micrel-ks8851.txt
net/core/netpoll.cThe net/core/netpoll.c conflict is a bug fix in 'net' happening
to code which is completely removed in 'net-next'.In micrel-ks8851.txt we simply have overlapping changes.
Signed-off-by: David S. Miller
19 Mar, 2014
1 commit
-
Steffen Klassert says:
====================
One patch to rename a newly introduced struct. The rest is
the rework of the IPsec virtual tunnel interface for ipv6 to
support inter address family tunneling and namespace crossing.1) Rename the newly introduced struct xfrm_filter to avoid a
conflict with iproute2. From Nicolas Dichtel.2) Introduce xfrm_input_afinfo to access the address family
dependent tunnel callback functions properly.3) Add and use a IPsec protocol multiplexer for ipv6.
4) Remove dst_entry caching. vti can lookup multiple different
dst entries, dependent of the configured xfrm states. Therefore
it does not make to cache a dst_entry.5) Remove caching of flow informations. vti6 does not use the the
tunnel endpoint addresses to do route and xfrm lookups.6) Update the vti6 to use its own receive hook.
7) Remove the now unused xfrm_tunnel_notifier. This was used from vti
and is replaced by the IPsec protocol multiplexer hooks.8) Support inter address family tunneling for vti6.
9) Check if the tunnel endpoints of the xfrm state and the vti interface
are matching and return an error otherwise.10) Enable namespace crossing for vti devices.
====================Signed-off-by: David S. Miller
14 Mar, 2014
1 commit
-
IPv6 can be build as a module, so we need mechanism to access
the address family dependent callback functions properly.
Therefore we introduce xfrm_input_afinfo, similar to that
what we have for the address family dependent part of
policies and states.Signed-off-by: Steffen Klassert
13 Mar, 2014
1 commit
-
We leak an active timer, the hotcpu notifier and all allocated
resources when we exit a namespace. Fix this by introducing a
flow_cache_fini() function where we release the resources before
we exit.Fixes: ca925cf1534e ("flowcache: Make flow cache name space aware")
Reported-by: Jakub Kicinski
Tested-by: Jakub Kicinski
Cc: Eric Dumazet
Cc: Fan Du
Signed-off-by: Steffen Klassert
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
10 Mar, 2014
1 commit
-
security_xfrm_policy_alloc can be called in atomic context so the
allocation should be done with GFP_ATOMIC. Add an argument to let the
callers choose the appropriate way. In order to do so a gfp argument
needs to be added to the method xfrm_policy_alloc_security in struct
security_operations and to the internal function
selinux_xfrm_alloc_user. After that switch to GFP_ATOMIC in the atomic
callers and leave GFP_KERNEL as before for the rest.
The path that needed the gfp argument addition is:
security_xfrm_policy_alloc -> security_ops.xfrm_policy_alloc_security ->
all users of xfrm_policy_alloc_security (e.g. selinux_xfrm_policy_alloc) ->
selinux_xfrm_alloc_user (here the allocation used to be GFP_KERNEL only)Now adding a gfp argument to selinux_xfrm_alloc_user requires us to also
add it to security_context_to_sid which is used inside and prior to this
patch did only GFP_KERNEL allocation. So add gfp argument to
security_context_to_sid and adjust all of its callers as well.CC: Paul Moore
CC: Dave Jones
CC: Steffen Klassert
CC: Fan Du
CC: David S. Miller
CC: LSM list
CC: SELinux listSigned-off-by: Nikolay Aleksandrov
Acked-by: Paul Moore
Signed-off-by: Steffen Klassert
07 Mar, 2014
1 commit
-
iproute2 already defines a structure with that name, let's use another one to
avoid any conflict.CC: Stephen Hemminger
Signed-off-by: Nicolas Dichtel
Signed-off-by: Steffen Klassert
06 Mar, 2014
1 commit
-
Conflicts:
drivers/net/wireless/ath/ath9k/recv.c
drivers/net/wireless/mwifiex/pcie.c
net/ipv6/sit.cThe SIT driver conflict consists of a bug fix being done by hand
in 'net' (missing u64_stats_init()) whilst in 'net-next' a helper
was created (netdev_alloc_pcpu_stats()) which takes care of this.The two wireless conflicts were overlapping changes.
Signed-off-by: David S. Miller
26 Feb, 2014
1 commit
-
When a policy is unlinked from the lists in thread context,
the xfrm timer can fire before we can mark this policy as dead.
So reinitialize the bydst hlist, then hlist_unhashed() will
notice that this policy is not linked and will avoid a
doulble unlink of that policy.Reported-by: Xianpeng Zhao
Signed-off-by: Steffen Klassert
25 Feb, 2014
2 commits
-
IPsec vti_rcv needs to remind the tunnel pointer to
check it later at the vti_rcv_cb callback. So add
this pointer to the IPsec common buffer, initialize
it and check it to avoid transport state matching of
a tunneled packet.Signed-off-by: Steffen Klassert
-
This patch add an IPsec protocol multiplexer. With this
it is possible to add alternative protocol handlers as
needed for IPsec virtual tunnel interfaces.Signed-off-by: Steffen Klassert
21 Feb, 2014
1 commit
-
The error pointer passed to xfrm_state_clone() is unchecked,
so remove it and indicate an error by returning a null pointer.Signed-off-by: Steffen Klassert
20 Feb, 2014
3 commits
-
We loose a lot of information of the original state if we
clone it with xfrm_state_clone(). In particular, there is
no crypto algorithm attached if the original state uses
an aead algorithm. This patch add the missing information
to the clone state.Signed-off-by: Steffen Klassert
-
A comment on xfrm_migrate_state_find() says that xfrm_state_lock
is held. This is apparently not the case, but we need it to
traverse through the state lists.Signed-off-by: Steffen Klassert
-
xfrm_state_sort() takes the unsorted states from the src array
and stores them into the dst array. We try to get the namespace
from the dst array which is empty at this time, so take the
namespace from the src array instead.Fixes: 283bc9f35bbbc ("xfrm: Namespacify xfrm state/policy locks")
Signed-off-by: Steffen Klassert
19 Feb, 2014
1 commit
-
We currently cache socket policy bundles at xfrm_policy_sk_bundles.
These cached bundles are never used. Instead we create and cache
a new one whenever xfrm_lookup() is called on a socket policy.Most protocols cache the used routes to the socket, so let's
remove the unused caching of socket policy bundles in xfrm.Signed-off-by: Steffen Klassert
17 Feb, 2014
1 commit
-
The goal of this patch is to allow userland to dump only a part of SA by
specifying a filter during the dump.
The kernel is in charge to filter SA, this avoids to generate useless netlink
traffic (it save also some cpu cycles). This is particularly useful when there
is a big number of SA set on the system.Note that I removed the union in struct xfrm_state_walk to fix a problem on arm.
struct netlink_callback->args is defined as a array of 6 long and the first long
is used in xfrm code to flag the cb as initialized. Hence, we must have:
sizeof(struct xfrm_state_walk)
Signed-off-by: Steffen Klassert
13 Feb, 2014
1 commit
-
In the case when KMs have no listeners, km_query() will fail and
temporary SAs are garbage collected immediately after their allocation.
This causes strain on memory allocation, leading even to OOM since
temporary SA alloc/free cycle is performed for every packet
and garbage collection does not keep up the pace.The sane thing to do is to make sure we have audience before
temporary SA allocation.Signed-off-by: Horia Geanta
Signed-off-by: Steffen Klassert