23 Oct, 2015

1 commit


25 Aug, 2015

1 commit

  • Hit the following splat testing VRF change for ipsec:

    [ 113.475692] ===============================
    [ 113.476194] [ INFO: suspicious RCU usage. ]
    [ 113.476667] 4.2.0-rc6-1+deb7u2+clUNRELEASED #3.2.65-1+deb7u2+clUNRELEASED Not tainted
    [ 113.477545] -------------------------------
    [ 113.478013] /work/monster-14/dsa/kernel.git/include/linux/rcupdate.h:568 Illegal context switch in RCU read-side critical section!
    [ 113.479288]
    [ 113.479288] other info that might help us debug this:
    [ 113.479288]
    [ 113.480207]
    [ 113.480207] rcu_scheduler_active = 1, debug_locks = 1
    [ 113.480931] 2 locks held by setkey/6829:
    [ 113.481371] #0: (&net->xfrm.xfrm_cfg_mutex){+.+.+.}, at: [] pfkey_sendmsg+0xfb/0x213
    [ 113.482509] #1: (rcu_read_lock){......}, at: [] rcu_read_lock+0x0/0x6e
    [ 113.483509]
    [ 113.483509] stack backtrace:
    [ 113.484041] CPU: 0 PID: 6829 Comm: setkey Not tainted 4.2.0-rc6-1+deb7u2+clUNRELEASED #3.2.65-1+deb7u2+clUNRELEASED
    [ 113.485422] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014
    [ 113.486845] 0000000000000001 ffff88001d4c7a98 ffffffff81518af2 ffffffff81086962
    [ 113.487732] ffff88001d538480 ffff88001d4c7ac8 ffffffff8107ae75 ffffffff8180a154
    [ 113.488628] 0000000000000b30 0000000000000000 00000000000000d0 ffff88001d4c7ad8
    [ 113.489525] Call Trace:
    [ 113.489813] [] dump_stack+0x4c/0x65
    [ 113.490389] [] ? console_unlock+0x3d6/0x405
    [ 113.491039] [] lockdep_rcu_suspicious+0xfa/0x103
    [ 113.491735] [] rcu_preempt_sleep_check+0x45/0x47
    [ 113.492442] [] ___might_sleep+0x19/0x1c8
    [ 113.493077] [] __might_sleep+0x6c/0x82
    [ 113.493681] [] cache_alloc_debugcheck_before.isra.50+0x1d/0x24
    [ 113.494508] [] kmem_cache_alloc+0x31/0x18f
    [ 113.495149] [] skb_clone+0x64/0x80
    [ 113.495712] [] pfkey_broadcast_one+0x3d/0xff
    [ 113.496380] [] pfkey_broadcast+0xb5/0x11e
    [ 113.497024] [] pfkey_register+0x191/0x1b1
    [ 113.497653] [] pfkey_process+0x162/0x17e
    [ 113.498274] [] pfkey_sendmsg+0x109/0x213

    In pfkey_sendmsg the net mutex is taken and then pfkey_broadcast takes
    the RCU lock.

    Since pfkey_broadcast takes the RCU lock, its allocation argument is
    pointless: GFP_ATOMIC must be used between rcu_read_lock() and
    rcu_read_unlock(). The one call outside of RCU can use GFP_KERNEL.

    Fixes: 7f6b9dbd5afbd ("af_key: locking change")
    Signed-off-by: David Ahern
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    David Ahern
     

25 Jun, 2015

1 commit

  • Pull networking updates from David Miller:

    1) Add TX fast path in mac80211, from Johannes Berg.

    2) Add TSO/GRO support to ibmveth, from Thomas Falcon

    3) Move away from cached routes in ipv6, just like ipv4, from Martin
    KaFai Lau.

    4) Lots of new rhashtable tests, from Thomas Graf.

    5) Run ingress qdisc lockless, from Alexei Starovoitov.

    6) Allow servers to fetch TCP packet headers for SYN packets of new
    connections, for fingerprinting. From Eric Dumazet.

    7) Add mode parameter to pktgen, for testing receive. From Alexei
    Starovoitov.

    8) Cache access optimizations via simplifications of build_skb(), from
    Alexander Duyck.

    9) Move page frag allocator under mm/, also from Alexander.

    10) Add xmit_more support to hv_netvsc, from KY Srinivasan.

    11) Add a counter guard in case we try to perform endless reclassify
    loops in the packet scheduler.

    12) Extend the flow dissector to be programmable and use it in the new
    "Flower" classifier. From Jiri Pirko.

    13) AF_PACKET fanout rollover fixes, performance improvements, and new
    statistics. From Willem de Bruijn.

    14) Add netdev driver for GENEVE tunnels, from John W Linville.

    15) Add ingress netfilter hooks and filtering, from Pablo Neira Ayuso.

    16) Fix handling of epoll edge triggers in TCP, from Eric Dumazet.

    17) Add an ECN retry fallback for the initial TCP handshake, from Daniel
    Borkmann.

    18) Add tail call support to BPF, from Alexei Starovoitov.

    19) Add several pktgen helper scripts, from Jesper Dangaard Brouer.

    20) Add zerocopy support to AF_UNIX, from Hannes Frederic Sowa.

    21) Favor even port numbers for allocation to connect() requests, and
    odd port numbers for bind(0), in an effort to help avoid
    ip_local_port_range exhaustion. From Eric Dumazet.

    22) Add Cavium ThunderX driver, from Sunil Goutham.

    23) Allow bpf programs to access skb_iif and dev->ifindex SKB metadata,
    from Alexei Starovoitov.

    24) Add support for T6 chips in cxgb4vf driver, from Hariprasad Shenai.

    25) Double TCP Small Queues default to 256K to accommodate situations
    like the XEN driver and wireless aggregation. From Wei Liu.

    26) Add more entropy inputs to flow dissector, from Tom Herbert.

    27) Add CDG congestion control algorithm to TCP, from Kenneth Klette
    Jonassen.

    28) Convert ipset over to RCU locking, from Jozsef Kadlecsik.

    29) Track and act upon link status of ipv4 route nexthops, from Andy
    Gospodarek.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1670 commits)
    bridge: vlan: flush the dynamically learned entries on port vlan delete
    bridge: multicast: add a comment to br_port_state_selection about blocking state
    net: inet_diag: export IPV6_V6ONLY sockopt
    stmmac: troubleshoot unexpected bits in des0 & des1
    net: ipv4 sysctl option to ignore routes when nexthop link is down
    net: track link-status of ipv4 nexthops
    net: switchdev: ignore unsupported bridge flags
    net: Cavium: Fix MAC address setting in shutdown state
    drivers: net: xgene: fix for ACPI support without ACPI
    ip: report the original address of ICMP messages
    net/mlx5e: Prefetch skb data on RX
    net/mlx5e: Pop cq outside mlx5e_get_cqe
    net/mlx5e: Remove mlx5e_cq.sqrq back-pointer
    net/mlx5e: Remove extra spaces
    net/mlx5e: Avoid TX CQE generation if more xmit packets expected
    net/mlx5e: Avoid redundant dev_kfree_skb() upon NOP completion
    net/mlx5e: Remove re-assignment of wq type in mlx5e_enable_rq()
    net/mlx5e: Use skb_shinfo(skb)->gso_segs rather than counting them
    net/mlx5e: Static mapping of netdev priv resources to/from netdev TX queues
    net/mlx4_en: Use HW counters for rx/tx bytes/packets in PF device
    ...

    Linus Torvalds
     

28 May, 2015

1 commit


11 May, 2015

1 commit


01 Apr, 2015

1 commit

  • In many places, the a6 field is cast to struct in6_addr. As the
    fields are in a union anyway, just add an in6_addr member to the union
    and get rid of the casts.

    Modifying the uapi header is okay; the union still has the same size.

    Signed-off-by: Jiri Benc
    Signed-off-by: David S. Miller

    Jiri Benc
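
The idea in the commit above can be sketched in user-space C. This is a minimal illustration, not the kernel's header: the union names `addr_before` and `addr_after` and the helper functions are hypothetical, and `uint32_t` stands in for the kernel's big-endian word type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>    /* htonl */
#include <netinet/in.h>   /* struct in6_addr, IN6_IS_ADDR_LOOPBACK */

/* Hypothetical "before" layout: only raw 32-bit words. */
union addr_before {
	uint32_t a4;
	uint32_t a6[4];
};

/* Hypothetical "after" layout: same words plus a typed IPv6 view. */
union addr_after {
	uint32_t        a4;
	uint32_t        a6[4];
	struct in6_addr in6;
};

/* Adding the member must not change the size seen by userspace. */
_Static_assert(sizeof(union addr_before) == sizeof(union addr_after),
	       "adding in6 must not grow the union");

/* Before: callers cast the word array to struct in6_addr. */
int is_loopback_cast(const union addr_before *a)
{
	return IN6_IS_ADDR_LOOPBACK((const struct in6_addr *)a->a6);
}

/* After: callers use the typed member directly, no cast needed. */
int is_loopback_typed(const union addr_after *a)
{
	return IN6_IS_ADDR_LOOPBACK(&a->in6);
}

/* Store ::1 through the word view and read it back through both paths. */
int loopback_views_agree(void)
{
	union addr_before b;
	union addr_after a;

	memset(&b, 0, sizeof(b));
	memset(&a, 0, sizeof(a));
	b.a6[3] = htonl(1);
	a.a6[3] = htonl(1);
	return is_loopback_cast(&b) && is_loopback_typed(&a);
}
```

The static assertion captures the commit's ABI argument: the new member is exactly as large as the existing word array, so the union's size is unchanged.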
     

03 Mar, 2015

1 commit

  • Now that TIPC no longer depends on the iocb argument in its internal
    implementations of the sendmsg() and recvmsg() hooks defined in the
    proto structure, nothing uses the iocb argument in them at all.
    We can therefore drop the redundant iocb argument from every
    implementation of sendmsg() and recvmsg() in the entire
    networking stack.

    Cc: Christoph Hellwig
    Suggested-by: Al Viro
    Signed-off-by: Ying Xue
    Signed-off-by: David S. Miller

    Ying Xue
     

24 Nov, 2014

1 commit


06 Nov, 2014

1 commit

  • This encapsulates all of the skb_copy_datagram_iovec() callers
    with call argument signature "skb, offset, msghdr->msg_iov, length".

    When we move to iov_iters in networking, the iov_iter object will
    sit in the msghdr.

    Having a helper like this means there will be fewer places to touch
    during that transformation.

    Based upon descriptions and patch from Al Viro.

    Signed-off-by: David S. Miller

    David S. Miller
     

16 Jul, 2014

1 commit


31 May, 2014

1 commit

  • This patch replaces a comma between expression statements by a semicolon.

    A simplified version of the semantic patch that performs this
    transformation is as follows:

    //
    @r@
    expression e1,e2,e;
    type T;
    identifier i;
    @@

    e1
    -,
    +;
    e2;
    //

    Signed-off-by: Himangi Saraogi
    Acked-by: Julia Lawall
    Signed-off-by: David S. Miller

    Himangi Saraogi
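
As a small illustration of why the transformation above is behavior-preserving when the comma separates two plain expression statements, consider the following sketch. The function names are made up for this example.

```c
#include <assert.h>

/* Two assignments joined by the comma operator in one statement. */
int assign_with_comma(void)
{
	int a = 0, b = 0;

	a = 1, b = 2;
	return a * 10 + b;
}

/* The same two assignments as separate statements. */
int assign_with_semicolon(void)
{
	int a = 0, b = 0;

	a = 1;
	b = 2;
	return a * 10 + b;
}
```

Both functions return the same value; the semicolon form is simply the conventional way to write two independent statements.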
     

23 Apr, 2014

1 commit

  • Commit f1370cc4 "xfrm: Remove useless secid field from xfrm_audit." changed
    "struct xfrm_audit" to have either
    { audit_get_loginuid(current) / audit_get_sessionid(current) } or
    { INVALID_UID / -1 } pair.

    This means that we can represent "struct xfrm_audit" as "bool".
    This patch replaces "struct xfrm_audit" argument with "bool".

    Signed-off-by: Tetsuo Handa
    Signed-off-by: Steffen Klassert

    Tetsuo Handa
     

22 Apr, 2014

1 commit

  • It seems to me that commit ab5f5e8b "[XFRM]: xfrm audit calls" is doing
    something strange at xfrm_audit_helper_usrinfo().
    If secid != 0 && security_secid_to_secctx(secid) != 0, the caller calls
    audit_log_task_context() which basically does
    secid != 0 && security_secid_to_secctx(secid) == 0 case
    except that secid is obtained from current thread's context.

    Oh, what happens if secid passed to xfrm_audit_helper_usrinfo() was
    obtained from other thread's context? It might audit current thread's
    context rather than other thread's context if security_secid_to_secctx()
    in xfrm_audit_helper_usrinfo() failed for some reason.

    Then, are all the callers of xfrm_audit_helper_usrinfo() passing either
    secid obtained from current thread's context or secid == 0?
    It seems to me that they are.

    If I didn't miss something, we don't need to pass secid to
    xfrm_audit_helper_usrinfo() because audit_log_task_context() will
    obtain secid from current thread's context.

    Signed-off-by: Tetsuo Handa
    Signed-off-by: Steffen Klassert

    Tetsuo Handa
     

12 Apr, 2014

1 commit

  • Several spots in the kernel perform a sequence like:

    skb_queue_tail(&sk->sk_receive_queue, skb);
    sk->sk_data_ready(sk, skb->len);

    But the moment we place the SKB onto the socket receive queue, it
    can be consumed and freed. So this skb->len access potentially
    touches freed memory.

    Furthermore, skb->len can be modified by the consumer, so it is
    possible that the value isn't accurate.

    And finally, no implementation of this callback actually uses
    the length argument. And since nobody cared about its
    value, lots of call sites pass in arbitrary values such as '0' and
    even '1'.

    So just remove the length argument from the callback, that way there
    is no confusion whatsoever and all of these use-after-free cases get
    fixed as a side effect.

    Based upon a patch by Eric Dumazet and his suggestion to audit this
    issue tree-wide.

    Signed-off-by: David S. Miller

    David S. Miller
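
The shape of the fix can be sketched in user-space C. This is only a model under stated assumptions: `pkt_sketch`, `sock_sketch`, and the helpers are hypothetical stand-ins for the kernel's SKB and socket types, and there is no real concurrency here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an SKB: a queued packet with a length. */
struct pkt_sketch {
	struct pkt_sketch *next;
	unsigned int len;
};

/* Hypothetical stand-in for a socket with a receive queue. */
struct sock_sketch {
	struct pkt_sketch *head, *tail;
	unsigned int wakeups;
	/* The callback takes only the socket, no length argument. */
	void (*data_ready)(struct sock_sketch *sk);
};

/* Example callback: just count how often the socket was signalled. */
void count_wakeup(struct sock_sketch *sk)
{
	sk->wakeups++;
}

/* Append the packet, then notify without touching the packet again. */
void queue_and_notify(struct sock_sketch *sk, struct pkt_sketch *p)
{
	p->next = NULL;
	if (sk->tail)
		sk->tail->next = p;
	else
		sk->head = p;
	sk->tail = p;
	/* From here on a consumer may own p, so p->len is not read. */
	sk->data_ready(sk);
}

/* Queue two packets and report how many notifications fired. */
unsigned int demo_wakeups(void)
{
	struct pkt_sketch a = { NULL, 100 }, b = { NULL, 200 };
	struct sock_sketch sk = { NULL, NULL, 0, count_wakeup };

	queue_and_notify(&sk, &a);
	queue_and_notify(&sk, &b);
	return sk.wakeups;
}
```

Because the callback signature carries no length, there is nothing left for a call site to read from the packet after it has been queued.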
     

26 Mar, 2014

1 commit


10 Mar, 2014

2 commits

  • security_xfrm_policy_alloc can be called in atomic context so the
    allocation should be done with GFP_ATOMIC. Add an argument to let the
    callers choose the appropriate way. In order to do so a gfp argument
    needs to be added to the method xfrm_policy_alloc_security in struct
    security_operations and to the internal function
    selinux_xfrm_alloc_user. After that switch to GFP_ATOMIC in the atomic
    callers and leave GFP_KERNEL as before for the rest.
    The path that needed the gfp argument addition is:
    security_xfrm_policy_alloc -> security_ops.xfrm_policy_alloc_security ->
    all users of xfrm_policy_alloc_security (e.g. selinux_xfrm_policy_alloc) ->
    selinux_xfrm_alloc_user (here the allocation used to be GFP_KERNEL only)

    Now adding a gfp argument to selinux_xfrm_alloc_user requires us to also
    add it to security_context_to_sid which is used inside and prior to this
    patch did only GFP_KERNEL allocation. So add gfp argument to
    security_context_to_sid and adjust all of its callers as well.

    CC: Paul Moore
    CC: Dave Jones
    CC: Steffen Klassert
    CC: Fan Du
    CC: David S. Miller
    CC: LSM list
    CC: SELinux list

    Signed-off-by: Nikolay Aleksandrov
    Acked-by: Paul Moore
    Signed-off-by: Steffen Klassert

    Nikolay Aleksandrov
     
  • There's a kmalloc with GFP_KERNEL in a helper
    (pfkey_sadb2xfrm_user_sec_ctx) used in pfkey_compile_policy which is
    called under rcu_read_lock. Adjust pfkey_sadb2xfrm_user_sec_ctx to have
    a gfp argument and adjust the users.

    CC: Dave Jones
    CC: Steffen Klassert
    CC: Fan Du
    CC: David S. Miller

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: Steffen Klassert

    Nikolay Aleksandrov
     

07 Mar, 2014

1 commit


21 Feb, 2014

1 commit


17 Feb, 2014

1 commit

  • The goal of this patch is to allow userland to dump only a subset of SAs by
    specifying a filter during the dump.
    The kernel is in charge of filtering SAs, which avoids generating useless netlink
    traffic (it also saves some CPU cycles). This is particularly useful when there
    are many SAs set on the system.

    Note that I removed the union in struct xfrm_state_walk to fix a problem on arm.
    struct netlink_callback->args is defined as an array of 6 longs and the first long
    is used in xfrm code to flag the cb as initialized. Hence, we must have:
    sizeof(struct xfrm_state_walk)
    Signed-off-by: Steffen Klassert

    Nicolas Dichtel
     

13 Feb, 2014

1 commit

  • In the case when KMs have no listeners, km_query() will fail and
    temporary SAs are garbage collected immediately after their allocation.
    This causes strain on memory allocation, leading even to OOM since
    temporary SA alloc/free cycle is performed for every packet
    and garbage collection does not keep up the pace.

    The sane thing to do is to make sure we have an audience before
    allocating a temporary SA.

    Signed-off-by: Horia Geanta
    Signed-off-by: Steffen Klassert

    Horia Geanta
     

16 Dec, 2013

1 commit


06 Dec, 2013

3 commits

  • We now queue packets to the policy if the states are not yet resolved;
    this replaces the ancient sleeping code. The sleeping could also cause
    indefinite task hangs if the needed state never got resolved.

    Signed-off-by: Steffen Klassert

    Steffen Klassert
     
  • Semantically, the xfrm layer is fully namespace aware,
    and so should be its locks, e.g. xfrm_state/policy_lock.
    Guarding the state/policy linked lists of different
    namespaces with one global lock is wrong semantically
    in the first place, as the namespaces are mutually
    independent, and more seriously it causes a scalability
    problem.

    One practical scenario is an Open Network Stack where
    hundreds of lxc tenants act as routers within one host;
    a global xfrm_state/policy_lock becomes the bottleneck.
    Once those locks are decoupled in a per-namespace
    fashion, lock contention is confined to a single
    namespace and causes no additional SPD/SAD access
    delay for other namespaces.

    This patch thus improves scalability without
    changing the original xfrm behavior.

    Signed-off-by: Fan Du
    Signed-off-by: Steffen Klassert

    Fan Du
     
  • because the home agent could well be run in a
    net namespace other than init_net. The original behavior
    could lead to inconsistent key info.

    Signed-off-by: Fan Du
    Signed-off-by: Steffen Klassert

    Fan Du
     

21 Nov, 2013

1 commit


17 Sep, 2013

1 commit

  • For the legacy IPsec anti-replay mechanism:

    The bitmap in struct xfrm_replay_state can only provide a 32-bit
    window in the current design, so the user-level parameter
    sadb_sa_replay should honor this limit. Otherwise setkey -D
    prints misleading output ("replay=244"):

    192.168.25.2 192.168.22.2
    esp mode=transport spi=147561170(0x08cb9ad2) reqid=0(0x00000000)
    E: aes-cbc 9a8d7468 7655cf0b 719d27be b0ddaac2
    A: hmac-sha1 2d2115c2 ebf7c126 1c54f186 3b139b58 264a7331
    seq=0x00000000 replay=244 flags=0x00000000 state=mature
    created: Sep 17 14:00:00 2013 current: Sep 17 14:00:22 2013
    diff: 22(s) hard: 30(s) soft: 26(s)
    last: Sep 17 14:00:00 2013 hard: 0(s) soft: 0(s)
    current: 1408(bytes) hard: 0(bytes) soft: 0(bytes)
    allocated: 22 hard: 0 soft: 0
    sadb_seq=1 pid=4854 refcnt=0
    192.168.22.2 192.168.25.2
    esp mode=transport spi=255302123(0x0f3799eb) reqid=0(0x00000000)
    E: aes-cbc 6485d990 f61a6bd5 e5660252 608ad282
    A: hmac-sha1 0cca811a eb4fa893 c47ae56c 98f6e413 87379a88
    seq=0x00000000 replay=244 flags=0x00000000 state=mature
    created: Sep 17 14:00:00 2013 current: Sep 17 14:00:22 2013
    diff: 22(s) hard: 30(s) soft: 26(s)
    last: Sep 17 14:00:00 2013 hard: 0(s) soft: 0(s)
    current: 1408(bytes) hard: 0(bytes) soft: 0(bytes)
    allocated: 22 hard: 0 soft: 0
    sadb_seq=0 pid=4854 refcnt=0

    Also, optimize the window check in xfrm_replay_check by setting the
    desired x->props.replay_window once, with a single comparison,
    when the xfrm_state is first created.

    Signed-off-by: Fan Du
    Signed-off-by: Steffen Klassert

    Fan Du
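
The clamping described above can be sketched as follows. This is a user-space illustration under assumptions: `replay_state_sketch` and `clamp_replay_window` are hypothetical names, and only the 32-bit bitmap width is taken from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the legacy replay state with a 32-bit bitmap. */
struct replay_state_sketch {
	uint32_t oseq;
	uint32_t seq;
	uint32_t bitmap;
};

/*
 * Clamp a user-requested replay window to the number of bits the
 * bitmap can actually track. Doing this once at state creation
 * means later per-packet checks need no extra comparison.
 */
unsigned int clamp_replay_window(unsigned int requested)
{
	unsigned int max_bits =
		(unsigned int)(sizeof(((struct replay_state_sketch *)0)->bitmap) * 8);

	return requested < max_bits ? requested : max_bits;
}
```

With this clamp, a request such as 244 is stored as 32, so tools that print the stored window no longer report a size the bitmap cannot honor.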
     

07 Aug, 2013

1 commit

  • present_and_same_family has already validated the address family of both
    SADB_EXT_ADDRESS_SRC and SADB_EXT_ADDRESS_DST at the beginning.
    pfkey_sadb_addr2xfrm_addr therefore doesn't need to do the check again.

    Signed-off-by: Fan Du
    Signed-off-by: Steffen Klassert

    Fan Du
     

05 Aug, 2013

2 commits


31 Jul, 2013

1 commit

  • This is inspired by a5cc68f3d6 "af_key: fix info leaks in notify
    messages". There are some struct members which don't get initialized
    and could disclose small amounts of private information.

    Acked-by: Mathias Krause
    Signed-off-by: Dan Carpenter
    Acked-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Dan Carpenter
     

27 Jun, 2013

1 commit

  • key_notify_sa_flush() and key_notify_policy_flush() fail to initialize
    the sadb_msg_reserved member of the broadcast message and thereby
    leak 2 bytes of heap memory to listeners. Fix that.

    Signed-off-by: Mathias Krause
    Cc: Steffen Klassert
    Cc: "David S. Miller"
    Cc: Herbert Xu
    Signed-off-by: David S. Miller

    Mathias Krause
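
A minimal user-space sketch of this class of leak and its fix follows. The struct `msg_hdr_sketch` is a hypothetical header loosely modeled on a PF_KEY message header, and the 0xAA fill merely simulates stale heap contents.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical message header, loosely modeled on a PF_KEY header. */
struct msg_hdr_sketch {
	uint8_t  version;
	uint8_t  type;
	uint8_t  err;
	uint8_t  satype;
	uint16_t len;
	uint16_t reserved;
	uint32_t seq;
	uint32_t pid;
};

/*
 * Fill a header in place, as a notify function would do on freshly
 * allocated memory. Every member is assigned, including reserved,
 * so no stale bytes reach the listeners.
 */
void fill_flush_msg(struct msg_hdr_sketch *hdr, uint32_t seq, uint32_t pid)
{
	hdr->version  = 2;
	hdr->type     = 9;   /* arbitrary example value */
	hdr->err      = 0;
	hdr->satype   = 0;
	hdr->len      = (uint16_t)(sizeof(*hdr) / 8);
	hdr->reserved = 0;   /* the member the fix initializes */
	hdr->seq      = seq;
	hdr->pid      = pid;
}

/* Pre-fill with 0xAA to mimic stale memory, then inspect reserved. */
uint16_t reserved_after_fill(void)
{
	struct msg_hdr_sketch hdr;

	memset(&hdr, 0xAA, sizeof(hdr));
	fill_flush_msg(&hdr, 1, 42);
	return hdr.reserved;
}
```

If the `reserved` assignment were omitted, the function would return 0xAAAA, which is the two-byte leak in miniature.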
     

01 Jun, 2013

1 commit

  • In some cases after deleting a policy from the SPD the policy would
    remain in the dst/flow/route cache for an extended period of time
    which caused problems for SELinux as its dynamic network access
    controls key off of the number of XFRM policy and state entries.
    This patch corrects this problem by forcing an XFRM garbage collection
    whenever a policy is successfully removed.

    Reported-by: Ondrej Moris
    Signed-off-by: Paul Moore
    Signed-off-by: David S. Miller

    Paul Moore
     

28 Mar, 2013

1 commit

  • Steffen Klassert says:

    ====================
    1) Initialize the satype field in key_notify_policy_flush(),
    this was left uninitialized. From Nicolas Dichtel.

    2) The sequence number difference for replay notifications
    was miscalculated on ESN sequence number wrap. We need
    a separate replay notify function for ESN.

    3) Fix an off by one in the esn replay notify function.
    From Mathias Krause.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

08 Mar, 2013

1 commit


28 Feb, 2013

1 commit

  • I'm not sure why, but the hlist for-each-entry iterators were conceived
    differently from the list ones. The list iterator looks like:

    list_for_each_entry(pos, head, member)

    The hlist ones were greedy and wanted an extra parameter:

    hlist_for_each_entry(tpos, pos, head, member)

    Why did they need an extra pos parameter? I'm not quite sure. Not only
    do they not really need it, it also prevents the iterator from looking
    exactly like the list iterator, which is unfortunate.

    Besides the semantic patch, there was some manual work required:

    - Fix up the actual hlist iterators in linux/list.h
    - Fix up the declaration of other iterators based on the hlist ones.
    - A very small number of places were using the 'node' parameter; these
    were modified to use 'obj->member' instead.
    - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
    properly, so those had to be fixed up manually.

    The semantic patch which is mostly the work of Peter Senna Tschudin is here:

    @@
    iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

    type T;
    expression a,c,d,e;
    identifier b;
    statement S;
    @@

    -T b;

    [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
    [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
    [akpm@linux-foundation.org: checkpatch fixes]
    [akpm@linux-foundation.org: fix warnings]
    [akpm@linux-foundation.org: redo intrusive kvm changes]
    Tested-by: Peter Senna Tschudin
    Acked-by: Paul E. McKenney
    Signed-off-by: Sasha Levin
    Cc: Wu Fengguang
    Cc: Marcelo Tosatti
    Cc: Gleb Natapov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sasha Levin
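
How a three-argument hlist iterator can work without the extra node cursor is sketched below in user-space C. The macro and type names are hypothetical and simplified; the sketch relies on the `__typeof__` extension available in GCC and Clang and is not the kernel's actual definition.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified singly linked hash-list node and head. */
struct hnode { struct hnode *next; };
struct hhead { struct hnode *first; };

/* Recover the containing object from a pointer to its node member. */
#define entry_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Like entry_of, but maps a NULL node to a NULL entry. */
#define entry_or_null(ptr, type, member) \
	((ptr) ? entry_of(ptr, type, member) : NULL)

/* Three-argument iterator: the entry pointer itself is the cursor. */
#define hlist_for_each_entry_sketch(pos, head, member)                     \
	for (pos = entry_or_null((head)->first, __typeof__(*(pos)), member);   \
	     pos;                                                              \
	     pos = entry_or_null((pos)->member.next, __typeof__(*(pos)), member))

struct item {
	int value;
	struct hnode node;
};

/* Sum all item values on the list using the three-argument form. */
int sum_items(struct hhead *head)
{
	struct item *it;
	int sum = 0;

	hlist_for_each_entry_sketch(it, head, node)
		sum += it->value;
	return sum;
}

/* Build a three-element list on the stack and sum it. */
int demo_sum(void)
{
	struct item a = { 1, { NULL } }, b = { 2, { NULL } }, c = { 3, { NULL } };
	struct hhead head = { &a.node };

	a.node.next = &b.node;
	b.node.next = &c.node;
	return sum_items(&head);
}
```

The next node is reached through `pos->member.next`, so a separate node cursor is unnecessary and the call site reads the same as the list iterator.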
     

21 Feb, 2013

1 commit


19 Feb, 2013

2 commits

  • proc_net_remove is only used to remove proc entries
    directly under /proc/net; it is not a general function for
    removing proc entries of a netns. To remove proc entries
    under /proc/net/stat/, we still
    need to call remove_proc_entry.

    This patch replaces proc_net_remove with remove_proc_entry,
    so proc_net_remove can be removed after this patch.

    Signed-off-by: Gao feng
    Signed-off-by: David S. Miller

    Gao feng
     
  • Right now, some modules such as bonding use proc_create
    to create proc entries under /proc/net/, and other modules
    such as ipv4 use proc_net_fops_create.

    This is a little chaotic. This patch changes all uses of
    proc_net_fops_create to proc_create, so
    proc_net_fops_create can be removed after this patch.

    Signed-off-by: Gao feng
    Signed-off-by: David S. Miller

    Gao feng
     

15 Feb, 2013

1 commit

  • Steffen Klassert says:

    ====================
    1) Remove a duplicated call to skb_orphan() in pf_key, from Cong Wang.

    2) Prepare xfrm and pf_key for algorithms without pf_key support,
    from Jussi Kivilinna.

    3) Fix an unbalanced lock in xfrm_output_one(), from Li RongQing.

    4) Add an IPsec state resolution packet queue to handle
    packets that are sent before the states are resolved.

    5) xfrm4_policy_fini() is unused since 2.6.11, time to remove it.
    From Michal Kubecek.

    6) The xfrm gc threshold was configurable just in the initial
    namespace, make it configurable in all namespaces. From
    Michal Kubecek.

    7) We currently can not insert policies with mark and mask
    such that some flows would be matched from both policies.
    Allow this if the priorities of these policies are different,
    the one with the higher priority is used in this case.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller