11 May, 2020

1 commit

  • After recent change 'x' is only used when CONFIG_NETFILTER is set:

    net/ipv4/xfrm4_output.c: In function '__xfrm4_output':
    net/ipv4/xfrm4_output.c:19:21: warning: unused variable 'x' [-Wunused-variable]
    19 | struct xfrm_state *x = skb_dst(skb)->xfrm;

    Expand the CONFIG_NETFILTER scope to avoid this.

    Fixes: 2ab6096db2f1 ("xfrm: remove output_finish indirection from xfrm_state_afinfo")
    Reported-by: Stephen Rothwell
    Signed-off-by: Florian Westphal
    Signed-off-by: Steffen Klassert

    Florian Westphal
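
The pattern behind this fix can be sketched in userspace as follows. This is only an illustration: `skb_sketch`, `xfrm_state_sketch`, `output_sketch` and the macro `CONFIG_NETFILTER_SKETCH` are hypothetical stand-ins, not the kernel's definitions. Declaring `x` inside the same conditional block as its only user keeps builds without the option free of the unused-variable warning.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's xfrm_state and sk_buff. */
struct xfrm_state_sketch { int family; };
struct skb_sketch {
    struct xfrm_state_sketch *xfrm; /* NULL if the packet was rerouted */
    int len;
};

/*
 * The declaration of x sits inside the #ifdef together with its only
 * user, so a build without CONFIG_NETFILTER_SKETCH never sees an
 * unused variable.
 */
int output_sketch(struct skb_sketch *skb)
{
    assert(skb != NULL);
#ifdef CONFIG_NETFILTER_SKETCH
    struct xfrm_state_sketch *x = skb->xfrm;

    if (!x)
        return -1; /* placeholder for the rerouted-packet path */
#endif
    return skb->len; /* placeholder for the common transform path */
}
```

Compiling with and without `-DCONFIG_NETFILTER_SKETCH` under `-Wunused-variable` should produce no warning in either configuration.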
     

06 May, 2020

2 commits

  • There are only two implementations, one for ipv4 and one for ipv6.

    Both are almost identical, they clear skb->cb[], set the TRANSFORMED flag
    in IP(6)CB and then call the common xfrm_output() function.

    By placing the IPCB handling into the common function, we avoid the need
    for the output_finish indirection as the output functions can simply
    use xfrm_output().

    Signed-off-by: Florian Westphal
    Signed-off-by: Steffen Klassert

    Florian Westphal
     
  • We can use a direct call for ipv4, so move the needed functions
    to net/xfrm/xfrm_output.c and call them directly.

    For ipv6 the indirection can be avoided as well but it will need
    a bit more work -- to ease review it will be done in another patch.

    Signed-off-by: Florian Westphal
    Signed-off-by: Steffen Klassert

    Florian Westphal
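
The first of the two commits above folds the per-family control-block setup into the common output function. A rough userspace sketch of that idea follows. All types, flag values and names ending in `_sketch` are assumptions made for illustration and do not mirror the kernel's real layouts.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical family identifiers and flag values. */
enum { FAMILY_IPV4_SKETCH = 4, FAMILY_IPV6_SKETCH = 6 };
#define IP4_XFRM_TRANSFORMED_SKETCH 0x1u
#define IP6_XFRM_TRANSFORMED_SKETCH 0x2u

/* Illustrative per-family control blocks sharing one scratch area. */
struct ip4_cb_sketch { unsigned int flags; unsigned char opt[16]; };
struct ip6_cb_sketch { unsigned short flags; unsigned short nhoff; };

struct skb_sketch {
    int outer_family;
    union {
        struct ip4_cb_sketch ip4;
        struct ip6_cb_sketch ip6;
    } cb;
};

/*
 * Common output function: the control block is cleared and flagged
 * here, so neither family needs a "finish" callback of its own.
 */
int xfrm_output_sketch(struct skb_sketch *skb)
{
    assert(skb != NULL);

    switch (skb->outer_family) {
    case FAMILY_IPV4_SKETCH:
        memset(&skb->cb.ip4, 0, sizeof(skb->cb.ip4));
        skb->cb.ip4.flags |= IP4_XFRM_TRANSFORMED_SKETCH;
        break;
    case FAMILY_IPV6_SKETCH:
        memset(&skb->cb.ip6, 0, sizeof(skb->cb.ip6));
        skb->cb.ip6.flags |= IP6_XFRM_TRANSFORMED_SKETCH;
        break;
    default:
        return -1; /* unknown family in this sketch */
    }
    return 0; /* the real transform work would follow here */
}
```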
     

16 Nov, 2019

1 commit

  • Instead of generally passing NULL to NF_HOOK_COND() for input device,
    pass skb->dev which contains input device for routed skbs.

    Note that iptables (both legacy and nft) rejects rules with an input
    interface match in POSTROUTING chains, but nftables allows them.

    Cc: Eric Garver
    Signed-off-by: Phil Sutter
    Signed-off-by: Pablo Neira Ayuso

    Phil Sutter
     

31 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 3029 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

08 Apr, 2019

3 commits

  • This structure is now only 4 bytes, so it's more efficient
    to cache a copy rather than its address.

    No significant size difference in allmodconfig vmlinux.

    With non-modular kernel that has all XFRM options enabled, this
    series reduces vmlinux image size by ~11kb. All xfrm_mode
    indirections are gone and all modes are built-in.

    before (ipsec-next master):
    text data bss dec filename
    21071494 7233140 11104324 39408958 vmlinux.master

    after this series:
    21066448 7226772 11104324 39397544 vmlinux.patched

    With allmodconfig kernel, the size increase is only 362 bytes,
    even though all the xfrm config options removed in this series are
    modular.

    before:
    text data bss dec filename
    15731286 6936912 4046908 26715106 vmlinux.master

    after this series:
    15731492 6937068 4046908 26715468 vmlinux

    Signed-off-by: Florian Westphal
    Reviewed-by: Sabrina Dubroca
    Signed-off-by: Steffen Klassert

    Florian Westphal
     
  • Adds an EXPORT_SYMBOL for afinfo_get_rcu, as it will now be called from
    ipv6 in case of CONFIG_IPV6=m.

    This change has virtually no effect on vmlinux size, but it reduces
    afinfo size and allows followup patch to make xfrm modes const.

    v2: mark if (afinfo) tests as likely (Sabrina)
    re-fetch afinfo according to inner_mode in xfrm_prepare_input().

    Signed-off-by: Florian Westphal
    Reviewed-by: Sabrina Dubroca
    Signed-off-by: Steffen Klassert

    Florian Westphal
     
  • Same as the input indirection. Only exception: we need to export
    xfrm_outer_mode_output for pktgen.

    Increases size of vmlinux by about 163 bytes:
    Before:
    text data bss dec filename
    15730208 6936948 4046908 26714064 vmlinux

    After:
    15730311 6937008 4046908 26714227 vmlinux

    xfrm_inner_extract_output has no more external callers, make it static.

    v2: add IS_ENABLED(IPV6) guard in xfrm6_prepare_output
    add two missing breaks in xfrm_outer_mode_output (Sabrina Dubroca)
    add WARN_ON_ONCE for 'call AF_INET6 related output function, but
    CONFIG_IPV6=n' case.
    make xfrm_inner_extract_output static

    Signed-off-by: Florian Westphal
    Reviewed-by: Sabrina Dubroca
    Signed-off-by: Steffen Klassert

    Florian Westphal
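
The last commit in this group replaces an indirect call with a direct dispatch, and its v2 notes mention missing breaks and a guard for builds without IPv6. The sketch below shows what such a dispatch might look like in plain C. The enum values, the return codes and the `IPV6_ENABLED_SKETCH` macro are hypothetical, and the real function operates on xfrm state and skb objects rather than integers.

```c
#include <assert.h>

/* Hypothetical compile-time switch standing in for an IPv6 config check. */
#ifndef IPV6_ENABLED_SKETCH
#define IPV6_ENABLED_SKETCH 1
#endif

enum { FAMILY_IPV4_SKETCH = 4, FAMILY_IPV6_SKETCH = 6 };
enum { ENCAP_TRANSPORT_SKETCH = 0, ENCAP_TUNNEL_SKETCH = 1 };
#define ERR_UNSUPPORTED_SKETCH (-1)

/*
 * Direct dispatch on encapsulation mode and address family.
 * Each case ends in a break, so an unsupported family in one mode
 * cannot fall through into the handlers of the next mode.
 */
int outer_mode_output_sketch(int encap, int family)
{
    assert(encap >= 0);

    switch (encap) {
    case ENCAP_TRANSPORT_SKETCH:
        if (family == FAMILY_IPV4_SKETCH)
            return 40; /* placeholder for the IPv4 transport handler */
        if (family == FAMILY_IPV6_SKETCH && IPV6_ENABLED_SKETCH)
            return 60; /* placeholder for the IPv6 transport handler */
        break;
    case ENCAP_TUNNEL_SKETCH:
        if (family == FAMILY_IPV4_SKETCH)
            return 41; /* placeholder for the IPv4 tunnel handler */
        if (family == FAMILY_IPV6_SKETCH && IPV6_ENABLED_SKETCH)
            return 61; /* placeholder for the IPv6 tunnel handler */
        break;
    default:
        break; /* a kernel version would warn once here */
    }
    return ERR_UNSUPPORTED_SKETCH;
}
```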
     

05 Mar, 2018

1 commit


14 Apr, 2017

1 commit

  • This patch adds all the bits that are needed to do
    IPsec hardware offload for IPsec states and ESP packets.
    We add xfrmdev_ops to the net_device. xfrmdev_ops has
    function pointers that are needed to manage the xfrm
    states in the hardware and to do a per packet
    offloading decision.

    Joint work with:
    Ilan Tayari
    Guy Shapiro
    Yossi Kuperman

    Signed-off-by: Guy Shapiro
    Signed-off-by: Ilan Tayari
    Signed-off-by: Yossi Kuperman
    Signed-off-by: Steffen Klassert

    Steffen Klassert
     

24 Oct, 2015

1 commit

  • Conflicts:
    net/ipv6/xfrm6_output.c
    net/openvswitch/flow_netlink.c
    net/openvswitch/vport-gre.c
    net/openvswitch/vport-vxlan.c
    net/openvswitch/vport.c
    net/openvswitch/vport.h

    The openvswitch conflicts were overlapping changes. One was
    the egress tunnel info fix in 'net' and the other was the
    vport ->send() op simplification in 'net-next'.

    The xfrm6_output.c conflict was also a simplification
    overlapping a bug fix.

    Signed-off-by: David S. Miller

    David S. Miller
     

19 Oct, 2015

1 commit

  • Commit 044a832a777 ("xfrm: Fix local error reporting crash
    with interfamily tunnels") moved the setting of skb->protocol
    behind the last access of the inner mode family to fix an
    interfamily crash. Unfortunately now skb->protocol might not
    be set at all, so we fail dispatch to the inner address family.
    As a result, the local error handler is not called and the
    mtu value is not reported back to userspace.

    We fix this by setting skb->protocol on message size errors
    before we call xfrm_local_error.

    Fixes: 044a832a7779c ("xfrm: Fix local error reporting crash with interfamily tunnels")
    Signed-off-by: Steffen Klassert

    Steffen Klassert
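
To illustrate why the ordering matters, here is a small userspace sketch under stated assumptions: the protocol constants, the `skb_sketch` type and both functions are hypothetical, and the real handlers send errors to sockets instead of recording a value. The error dispatcher only acts when the protocol field is set, so the size check sets it before reporting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical protocol tags; 0 means "not set yet". */
enum { PROTO_UNSET_SKETCH = 0, PROTO_IPV4_SKETCH = 4, PROTO_IPV6_SKETCH = 6 };

struct skb_sketch {
    int protocol;     /* stands in for skb->protocol */
    int len;
    int reported_mtu; /* records what the error handler saw */
};

/* Dispatches on the protocol; returns 1 if a handler ran, 0 otherwise. */
int local_error_sketch(struct skb_sketch *skb, int mtu)
{
    assert(skb != NULL);

    if (skb->protocol != PROTO_IPV4_SKETCH &&
        skb->protocol != PROTO_IPV6_SKETCH)
        return 0; /* no handler: the MTU is never reported */

    skb->reported_mtu = mtu;
    return 1;
}

/* Size check that sets the protocol before reporting the error. */
int tunnel_check_size_sketch(struct skb_sketch *skb, int mtu, int inner_proto)
{
    assert(skb != NULL);

    if (skb->len <= mtu)
        return 0;

    skb->protocol = inner_proto; /* set first, so dispatch can succeed */
    local_error_sketch(skb, mtu);
    return -1; /* placeholder for a "message too long" error code */
}
```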
     

08 Oct, 2015

2 commits


18 Sep, 2015

4 commits

  • In code review it was noticed that I had failed to add some blank lines
    in places where they are customarily used. Taking a second look at the
    code I have to agree blank lines would be nice so I have added them
    here.

    Reported-by: Nicolas Dichtel
    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • This is immediately motivated by the bridge code that chains functions that
    call into netfilter. Without passing net into the okfns the bridge code would
    need to guess about the best expression for the network namespace to process
    packets in.

    As net is frequently one of the first things computed in continuation functions
    after netfilter has done its job, passing in the desired network namespace is in
    many cases a code simplification.

    To support this change the function dst_output_okfn is introduced to
    simplify passing dst_output as an okfn. For the moment dst_output_okfn
    just silently drops the struct net.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • Pass a network namespace parameter into the netfilter hooks. At the
    call site of the netfilter hooks the path a packet is taking through
    the network stack is well known which allows the network namespace to
    be easily and reliably determined.

    This allows the replacement of magic code like
    "dev_net(state->in?:state->out)" that appears at the start of most
    netfilter hooks with "state->net".

    In almost all cases the network namespace passed in is derived
    from the first network device passed in, guaranteeing those
    paths will not see any changes in practice.

    The exceptions are:
    xfrm/xfrm_output.c:xfrm_output_resume() xs_net(skb_dst(skb)->xfrm)
    ipvs/ip_vs_xmit.c:ip_vs_nat_send_or_cont() ip_vs_conn_net(cp)
    ipvs/ip_vs_xmit.c:ip_vs_send_or_cont() ip_vs_conn_net(cp)
    ipv4/raw.c:raw_send_hdrinc() sock_net(sk)
    ipv6/ip6_output.c:ip6_xmit() sock_net(sk)
    ipv6/ndisc.c:ndisc_send_skb() dev_net(skb->dev) not dev_net(dst->dev)
    ipv6/raw.c:raw6_send_hdrinc() sock_net(sk)
    br_netfilter_hooks.c:br_nf_pre_routing_finish() dev_net(skb->dev) before skb->dev is set to nf_bridge->physindev

    In all cases these exceptions seem to be a better expression for the
    network namespace the packet is being processed in than the historic
    "dev_net(in?in:out)". I am documenting them in case something odd
    pops up and someone starts trying to track down what happened.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • Add a sock parameter to dst_output making dst_output_sk superfluous.
    Add a skb->sk parameter to all of the callers of dst_output
    Have the callers of dst_output_sk call dst_output.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
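
Two of the commits in this group change function signatures: the output function gains a socket parameter, and a wrapper with the continuation-function shape accepts a network namespace but drops it. A minimal userspace sketch of that arrangement follows. Every type and function here is a hypothetical stand-in with a `_sketch` suffix, not the kernel's definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct net, struct sock and struct sk_buff. */
struct net_sketch { int id; };
struct sock_sketch { int family; };
struct skb_sketch;

typedef int (*output_fn_sketch)(struct sock_sketch *sk, struct skb_sketch *skb);

struct skb_sketch {
    struct sock_sketch *sk;  /* originating socket, may be NULL */
    output_fn_sketch output; /* output method of the attached route */
    int sent;
};

/* Output entry point that takes the socket explicitly. */
int dst_output_sketch(struct sock_sketch *sk, struct skb_sketch *skb)
{
    assert(skb != NULL && skb->output != NULL);
    return skb->output(sk, skb);
}

/* Continuation-shaped wrapper: accepts net, then silently drops it. */
int dst_output_okfn_sketch(struct net_sketch *net, struct sock_sketch *sk,
                           struct skb_sketch *skb)
{
    (void)net;
    return dst_output_sketch(sk, skb);
}

/* Example output method that just records the call. */
int record_output_sketch(struct sock_sketch *sk, struct skb_sketch *skb)
{
    (void)sk;
    skb->sent = 1;
    return 0;
}
```

Callers of the output function would pass the socket they already hold, typically the one stored on the buffer, as in `dst_output_sketch(skb->sk, skb)`.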
     

08 Apr, 2015

1 commit

  • On the output paths in particular, we have to sometimes deal with two
    socket contexts. First, and usually skb->sk, is the local socket that
    generated the frame.

    And second, is potentially the socket used to control a tunneling
    socket, such as one that encapsulates using UDP.

    We do not want to disassociate skb->sk when encapsulating in order
    to fix this, because that would break socket memory accounting.

    The most extreme case where this can cause huge problems is an
    AF_PACKET socket transmitting over a vxlan device. We hit code
    paths doing checks that assume they are dealing with an ipv4
    socket, but are actually operating upon the AF_PACKET one.

    Signed-off-by: David S. Miller

    David Miller
     

09 Feb, 2015

1 commit

  • We set the outer mode protocol too early. As a result, the
    local error handler might dispatch to the wrong address family
    and report the error to a wrong socket type. We fix this by
    setting the outer protocol to the skb after we accessed the
    inner mode for the last time, right before we do the actual
    encapsulation where we switch finally to the outer mode.

    Reported-by: Chris Ruehl
    Tested-by: Chris Ruehl
    Signed-off-by: Steffen Klassert

    Steffen Klassert
     

24 May, 2014

1 commit

  • Conflicts:
    drivers/net/bonding/bond_alb.c
    drivers/net/ethernet/altera/altera_msgdma.c
    drivers/net/ethernet/altera/altera_sgdma.c
    net/ipv6/xfrm6_output.c

    Several cases of overlapping changes.

    The xfrm6_output.c has a bug fix which overlaps the renaming
    of skb->local_df to skb->ignore_df.

    In the Altera TSE driver cases, the register access cleanups
    in net-next overlapped with bug fixes done in net.

    Similarly a bug fix to send ALB packets in the bonding driver using
    the right source address overlaps with cleanups in net-next.

    Signed-off-by: David S. Miller

    David S. Miller
     

16 May, 2014

1 commit

  • Conflicts:
    net/ipv4/ip_vti.c

    Steffen Klassert says:

    ====================
    pull request (net): ipsec 2014-05-15

    This pull request has a merge conflict in net/ipv4/ip_vti.c
    between commit 8d89dcdf80d8 ("vti: don't allow to add the same
    tunnel twice") and commit a32452366b72 ("vti4:Don't count header
    length twice"). It can be solved like it is done in linux-next.

    1) Fix a ipv6 xfrm output crash when a packet is rerouted
    by netfilter to not use IPsec.

    2) vti4 counts some header lengths twice leading to an incorrect
    device mtu. Fix this by counting these headers only once.

    3) We don't catch the case if an unsupported protocol is submitted
    to the xfrm protocol handlers, this can lead to NULL pointer
    dereferences. Fix this by adding the appropriate checks.

    4) vti6 may unregister pernet ops twice on init errors.
    Fix this by removing one of the calls to do it only once.
    From Mathias Krause.

    5) Set the vti tunnel mark before doing a lookup in the error
    handlers. Otherwise we don't find the correct xfrm state.
    ====================

    The conflict in ip_vti.c was simple, 'net' had a commit
    removing a line from vti_tunnel_init() and this tree
    being merged had a commit adding a line to the same
    location.

    Signed-off-by: David S. Miller

    David S. Miller
     

13 May, 2014

1 commit

  • As suggested by several people, rename local_df to ignore_df,
    since it means "ignore df bit if it is set".

    Cc: Maciej Żenczykowski
    Cc: Florian Westphal
    Cc: David S. Miller
    Cc: Eric Dumazet
    Signed-off-by: Cong Wang
    Acked-by: Maciej Żenczykowski
    Signed-off-by: David S. Miller

    WANG Cong
     

16 Apr, 2014

1 commit

  • In the dst->output() path for ipv4, the code assumes the skb it has to
    transmit is attached to an inet socket, specifically via
    ip_mc_output() : The sk_mc_loop() test triggers a WARN_ON() when the
    provider of the packet is an AF_PACKET socket.

    The dst->output() method gets an additional 'struct sock *sk'
    parameter. This needs a cascade of changes so that this parameter can
    be propagated from vxlan to final consumer.

    Fixes: 8f646c922d55 ("vxlan: keep original skb ownership")
    Reported-by: lucien xin
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

07 Apr, 2014

1 commit

  • The ipv6 xfrm output path is not aware that packets can be
    rerouted by NAT to not use IPsec. We crash in this case
    because we expect to have a xfrm state at the dst_entry.
    This crash happens if the ipv6 layer does IPsec and NAT
    or if we have an interfamily IPsec tunnel with ipv4 NAT.

    We fix this by checking for a NAT rerouted packet in each
    address family and dst_output() to the new destination
    in this case.

    Reported-by: Martin Pelikan
    Tested-by: Martin Pelikan
    Signed-off-by: Steffen Klassert

    Steffen Klassert
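
The check described in this commit can be pictured with the following userspace sketch. It assumes hypothetical types (`dst_sketch`, `skb_sketch`, `xfrm_state_sketch`) and return codes. The point is only that a route without an attached transform state is handed to that route's own output method instead of being dereferenced as an IPsec route.

```c
#include <assert.h>
#include <stddef.h>

struct xfrm_state_sketch { int id; };
struct skb_sketch;

/* Hypothetical route entry: xfrm is NULL for a plain (non-IPsec) route. */
struct dst_sketch {
    struct xfrm_state_sketch *xfrm;
    int (*output)(struct skb_sketch *skb);
};

struct skb_sketch { struct dst_sketch *dst; };

enum { PATH_PLAIN_SKETCH = 1, PATH_IPSEC_SKETCH = 2 };

/* Output method of a plain route. */
int plain_output_sketch(struct skb_sketch *skb)
{
    (void)skb;
    return PATH_PLAIN_SKETCH;
}

/*
 * IPsec output entry: if NAT rerouted the packet to a route without a
 * transform state, pass it on to the new destination's output method.
 */
int xfrm_output_path_sketch(struct skb_sketch *skb)
{
    assert(skb != NULL && skb->dst != NULL);

    if (!skb->dst->xfrm)
        return skb->dst->output(skb);

    return PATH_IPSEC_SKETCH; /* placeholder for the transform path */
}
```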
     

26 Aug, 2013

1 commit

  • In commit 0ea9d5e3e0e03a63b11392f5613378977dae7eca ("xfrm: introduce
    helper for safe determination of mtu") I switched the determination of
    ipv4 mtus from dst_mtu to ip_skb_dst_mtu. This was an error because in
    case of IP_PMTUDISC_PROBE we fall back to the interface mtu, which is
    never correct for ipv4 ipsec.

    This patch partly reverts 0ea9d5e3e0e03a63b11392f5613378977dae7eca
    ("xfrm: introduce helper for safe determination of mtu").

    Cc: Steffen Klassert
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: Steffen Klassert

    Hannes Frederic Sowa
     

14 Aug, 2013

2 commits

  • skb->sk socket can be of AF_INET or AF_INET6 address family. Thus we
    always have to make sure we are referring to the correct interpretation
    of skb->sk.

    We only depend on header defines to query the mtu, so we don't introduce
    a new dependency to ipv6 by this change.

    Cc: Steffen Klassert
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: Steffen Klassert

    Hannes Frederic Sowa
     
  • In xfrm4 and xfrm6 we need to take care about sockets of the other
    address family. This could happen because a 6in4 or 4in6 tunnel could
    get protected by ipsec.

    Because we don't want to have a run-time dependency on ipv6 when only
    using ipv4 xfrm we have to embed a pointer to the correct local_error
    function in xfrm_state_afinfo and look it up when returning an error
    depending on the socket address family.

    Thanks to vi0ss for the great bug report:

    v2:
    a) fix two more unsafe interpretations of skb->sk as ipv6 socket
    (xfrm6_local_dontfrag and __xfrm6_output)
    v3:
    a) add an EXPORT_SYMBOL_GPL(xfrm_local_error) to fix a link error when
    building ipv6 as a module (thanks to Steffen Klassert)

    Reported-by:
    Cc: Steffen Klassert
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: Steffen Klassert

    Hannes Frederic Sowa
     

02 Jul, 2011

1 commit


11 May, 2011

1 commit

  • As it is, we assign the outer mode's output function to the dst entry
    when we create the xfrm bundle. This leads to two problems on interfamily
    scenarios. We might insert ipv4 packets into ip6_fragment when called
    from xfrm6_output. The system crashes if we try to fragment an ipv4
    packet with ip6_fragment. This issue was introduced with git commit
    ad0081e4 (ipv6: Fragment locally generated tunnel-mode IPSec6 packets
    as needed). The second issue is that we might insert ipv4 packets in
    netfilter6 and vice versa on interfamily scenarios.

    With this patch we assign the inner mode output function to the dst entry
    when we create the xfrm bundle. So xfrm4_output/xfrm6_output from the inner
    mode is used and the right fragmentation and netfilter functions are called.
    We switch then to outer mode with the output_finish functions.

    Signed-off-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Steffen Klassert
     

25 Mar, 2010

1 commit


03 Jun, 2009

1 commit

  • Define three accessors to get/set dst attached to a skb

    struct dst_entry *skb_dst(const struct sk_buff *skb)

    void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

    void skb_dst_drop(struct sk_buff *skb)
    This one should replace occurrences of :
    dst_release(skb->dst)
    skb->dst = NULL;

    Delete skb->dst field

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
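
The three accessors might be approximated in userspace like this. It is a sketch under assumptions: the structures carry a `_sketch` suffix, the private field name `_dst` is hypothetical, and reference counting is reduced to a plain integer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical route entry with a simplified reference count. */
struct dst_entry_sketch { int refcnt; };

/* The route pointer is reached only through the accessors below. */
struct sk_buff_sketch { struct dst_entry_sketch *_dst; };

/* Drops one reference; a NULL pointer is ignored. */
void dst_release_sketch(struct dst_entry_sketch *dst)
{
    if (dst)
        dst->refcnt--;
}

/* Returns the route attached to the buffer, or NULL. */
struct dst_entry_sketch *skb_dst_sketch(const struct sk_buff_sketch *skb)
{
    assert(skb != NULL);
    return skb->_dst;
}

/* Attaches a route; the caller is assumed to already hold a reference. */
void skb_dst_set_sketch(struct sk_buff_sketch *skb, struct dst_entry_sketch *dst)
{
    assert(skb != NULL);
    skb->_dst = dst;
}

/* Replaces the open-coded "release, then set to NULL" sequence. */
void skb_dst_drop_sketch(struct sk_buff_sketch *skb)
{
    assert(skb != NULL);
    dst_release_sketch(skb->_dst);
    skb->_dst = NULL;
}
```

Routing every access through accessors is what makes it possible to change or remove the underlying field later without touching each caller.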
     

25 Mar, 2008

1 commit


29 Jan, 2008

5 commits

  • The IPv4 and IPv6 hook values are identical, yet some code tries to figure
    out the "correct" value by looking at the address family. Introduce NF_INET_*
    values for both IPv4 and IPv6. The old values are kept in a #ifndef __KERNEL__
    section for userspace compatibility.

    Signed-off-by: Patrick McHardy
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • The nhoff field isn't actually necessary in xfrm_input. For tunnel
    mode transforms we now throw away the output IP header so it makes no
    sense to fill in the nexthdr field. For transport mode we can now let
    the function transport_finish do the setting and it knows where the
    nexthdr field is.

    The only other thing that needs the nexthdr field to be set is the
    header extraction code. However, we can simply move the protocol
    extraction out of the generic header extraction.

    We want to minimise the amount of info we have to carry around between
    transforms as this simplifies the resumption process for async crypto.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • As part of the work on asynchronous cryptographic operations, we need
    to be able to resume from the spot where they occur. As such, it
    helps if we isolate them to one spot.

    This patch moves most of the remaining family-specific processing into
    the common output code.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Most callers of the LOCAL_OUT chain will set the IP packet length and
    header checksum before doing so. They also share the same output
    function dst_output.

    This patch creates a new function called ip_local_out which does all
    of that and converts the appropriate users over to it.

    Apart from removing duplicate code, it will also help in merging the
    IPsec output path once the same thing is done for IPv6.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • With inter-family transforms the inner mode differs from the outer
    mode. Attempting to handle both sides from the same function means
    that it needs to handle both IPv4 and IPv6 which creates duplication
    and confusion.

    This patch separates the two parts on the output path so that each
    function deals with one family only.

    In particular, the functions xfrm4_extract_output/xfrm6_extract_output
    move the pertinent fields from the IPv4/IPv6 IP headers into a
    neutral format stored in skb->cb. This is then used by the outer mode
    output functions to write the outer IP header. In this way the output
    function no longer has to know about the inner address family.

    Since the extract functions are only called by tunnel modes (the only
    modes that can support inter-family transforms), I've also moved the
    xfrm*_tunnel_check_size calls into them. This allows the correct ICMP
    message to be sent as opposed to now where you might call icmp_send
    with an IPv6 packet and vice versa.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
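
For the ip_local_out commit in this group, the two preparation steps it describes (setting the packet length and the header checksum) can be sketched in userspace as below. The function names are hypothetical, the header is handled as a raw byte array in network byte order, and the netfilter hook and the final output call are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement checksum over a header given in network byte order. */
uint16_t ip_header_checksum_sketch(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    assert(hdr != NULL && len % 2 == 0);

    for (i = 0; i < len; i += 2)
        sum += (uint32_t)((hdr[i] << 8) | hdr[i + 1]);

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);

    return (uint16_t)~sum;
}

/* Fills in the total length (bytes 2-3) and checksum (bytes 10-11). */
void ip_local_out_prepare_sketch(uint8_t *hdr, size_t hdr_len, uint16_t total_len)
{
    uint16_t csum;

    assert(hdr != NULL && hdr_len >= 20);

    hdr[2] = (uint8_t)(total_len >> 8);
    hdr[3] = (uint8_t)(total_len & 0xff);

    hdr[10] = 0; /* checksum field must be zero while summing */
    hdr[11] = 0;
    csum = ip_header_checksum_sketch(hdr, hdr_len);
    hdr[10] = (uint8_t)(csum >> 8);
    hdr[11] = (uint8_t)(csum & 0xff);
}
```

A header prepared this way checksums to zero when summed again including the checksum field, which is how a receiver would validate it.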
     

18 Oct, 2007

2 commits

  • This patch adds a new field to xfrm states called inner_mode. The existing
    mode object is renamed to outer_mode.

    This is the first part of an attempt to fix inter-family transforms. As it
    is we always use the outer family when determining which mode to use. As a
    result we may end up shoving IPv4 packets into netfilter6 and vice versa.

    What we really want is to use the inner family for the first part of outbound
    processing and the outer family for the second part. For inbound processing
    we'd use the opposite pairing.

    I've also added a check to prevent silly combinations such as transport mode
    with inter-family transforms.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Currently BEET mode does not reinject the packet back into the stack
    like tunnel mode does. Since BEET should behave just like tunnel mode
    this is incorrect.

    This patch fixes this by introducing a flags field to xfrm_mode that
    tells the IPsec code whether it should terminate and reinject the packet
    back into the stack.

    It then sets the flag for BEET and tunnel mode.

    I've also added a number of missing BEET checks elsewhere where we check
    whether a given mode is a tunnel or not.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
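
The flag-based approach from the BEET commit above could look roughly like this in userspace. The flag value, the `xfrm_mode_sketch` structure and the mode objects are hypothetical. The sketch only shows that callers test a flag rather than comparing against a specific mode.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag: the mode terminates and reinjects the packet. */
#define MODE_FLAG_TUNNEL_SKETCH 0x1u

struct xfrm_mode_sketch {
    const char *name;
    unsigned int flags;
};

/* Tunnel and BEET both carry the flag; transport mode does not. */
const struct xfrm_mode_sketch transport_mode_sketch = { "transport", 0 };
const struct xfrm_mode_sketch tunnel_mode_sketch = { "tunnel", MODE_FLAG_TUNNEL_SKETCH };
const struct xfrm_mode_sketch beet_mode_sketch = { "beet", MODE_FLAG_TUNNEL_SKETCH };

/* Returns 1 if the packet should be reinjected into the stack. */
int should_reinject_sketch(const struct xfrm_mode_sketch *mode)
{
    assert(mode != NULL);
    return (mode->flags & MODE_FLAG_TUNNEL_SKETCH) != 0;
}
```

With this shape, a mode that should behave like tunnel mode only needs the flag set, and no call site has to be updated.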