14 Apr, 2017

1 commit

  • This patch adds all the bits that are needed to do
    IPsec hardware offload for IPsec states and ESP packets.
    We add xfrmdev_ops to the net_device. xfrmdev_ops has
    function pointers that are needed to manage the xfrm
    states in the hardware and to do a per packet
    offloading decision.

    Joint work with:
    Ilan Tayari
    Guy Shapiro
    Yossi Kuperman

    Signed-off-by: Guy Shapiro
    Signed-off-by: Ilan Tayari
    Signed-off-by: Yossi Kuperman
    Signed-off-by: Steffen Klassert

    Steffen Klassert
     

24 Oct, 2015

1 commit

  • Conflicts:
    net/ipv6/xfrm6_output.c
    net/openvswitch/flow_netlink.c
    net/openvswitch/vport-gre.c
    net/openvswitch/vport-vxlan.c
    net/openvswitch/vport.c
    net/openvswitch/vport.h

    The openvswitch conflicts were overlapping changes. One was
    the egress tunnel info fix in 'net' and the other was the
    vport ->send() op simplification in 'net-next'.

    The xfrm6_output.c conflicts was also a simplification
    overlapping a bug fix.

    Signed-off-by: David S. Miller

    David S. Miller
     

19 Oct, 2015

1 commit

  • Commit 044a832a777 ("xfrm: Fix local error reporting crash
    with interfamily tunnels") moved the setting of skb->protocol
    behind the last access of the inner mode family to fix an
    interfamily crash. Unfortunately now skb->protocol might not
    be set at all, so we fail dispatch to the inner address family.
    As a reault, the local error handler is not called and the
    mtu value is not reported back to userspace.

    We fix this by setting skb->protocol on message size errors
    before we call xfrm_local_error.

    Fixes: 044a832a7779c ("xfrm: Fix local error reporting crash with interfamily tunnels")
    Signed-off-by: Steffen Klassert

    Steffen Klassert
     

08 Oct, 2015

2 commits


18 Sep, 2015

4 commits

  • In code review it was noticed that I had failed to add some blank lines
    in places where they are customarily used. Taking a second look at the
    code I have to agree blank lines would be nice so I have added them
    here.

    Reported-by: Nicolas Dichtel
    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • This is immediately motivated by the bridge code that chains functions that
    call into netfilter. Without passing net into the okfns the bridge code would
    need to guess about the best expression for the network namespace to process
    packets in.

    As net is frequently one of the first things computed in continuation functions
    after netfilter has done it's job passing in the desired network namespace is in
    many cases a code simplification.

    To support this change the function dst_output_okfn is introduced to
    simplify passing dst_output as an okfn. For the moment dst_output_okfn
    just silently drops the struct net.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • Pass a network namespace parameter into the netfilter hooks. At the
    call site of the netfilter hooks the path a packet is taking through
    the network stack is well known which allows the network namespace to
    be easily and reliabily.

    This allows the replacement of magic code like
    "dev_net(state->in?:state->out)" that appears at the start of most
    netfilter hooks with "state->net".

    In almost all cases the network namespace passed in is derived
    from the first network device passed in, guaranteeing those
    paths will not see any changes in practice.

    The exceptions are:
    xfrm/xfrm_output.c:xfrm_output_resume() xs_net(skb_dst(skb)->xfrm)
    ipvs/ip_vs_xmit.c:ip_vs_nat_send_or_cont() ip_vs_conn_net(cp)
    ipvs/ip_vs_xmit.c:ip_vs_send_or_cont() ip_vs_conn_net(cp)
    ipv4/raw.c:raw_send_hdrinc() sock_net(sk)
    ipv6/ip6_output.c:ip6_xmit() sock_net(sk)
    ipv6/ndisc.c:ndisc_send_skb() dev_net(skb->dev) not dev_net(dst->dev)
    ipv6/raw.c:raw6_send_hdrinc() sock_net(sk)
    br_netfilter_hooks.c:br_nf_pre_routing_finish() dev_net(skb->dev) before skb->dev is set to nf_bridge->physindev

    In all cases these exceptions seem to be a better expression for the
    network namespace the packet is being processed in then the historic
    "dev_net(in?in:out)". I am documenting them in case something odd
    pops up and someone starts trying to track down what happened.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • Add a sock paramter to dst_output making dst_output_sk superfluous.
    Add a skb->sk parameter to all of the callers of dst_output
    Have the callers of dst_output_sk call dst_output.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

08 Apr, 2015

1 commit

  • On the output paths in particular, we have to sometimes deal with two
    socket contexts. First, and usually skb->sk, is the local socket that
    generated the frame.

    And second, is potentially the socket used to control a tunneling
    socket, such as one the encapsulates using UDP.

    We do not want to disassociate skb->sk when encapsulating in order
    to fix this, because that would break socket memory accounting.

    The most extreme case where this can cause huge problems is an
    AF_PACKET socket transmitting over a vxlan device. We hit code
    paths doing checks that assume they are dealing with an ipv4
    socket, but are actually operating upon the AF_PACKET one.

    Signed-off-by: David S. Miller

    David Miller
     

09 Feb, 2015

1 commit

  • We set the outer mode protocol too early. As a result, the
    local error handler might dispatch to the wrong address family
    and report the error to a wrong socket type. We fix this by
    setting the outer protocol to the skb after we accessed the
    inner mode for the last time, right before we do the atcual
    encapsulation where we switch finally to the outer mode.

    Reported-by: Chris Ruehl
    Tested-by: Chris Ruehl
    Signed-off-by: Steffen Klassert

    Steffen Klassert
     

24 May, 2014

1 commit

  • Conflicts:
    drivers/net/bonding/bond_alb.c
    drivers/net/ethernet/altera/altera_msgdma.c
    drivers/net/ethernet/altera/altera_sgdma.c
    net/ipv6/xfrm6_output.c

    Several cases of overlapping changes.

    The xfrm6_output.c has a bug fix which overlaps the renaming
    of skb->local_df to skb->ignore_df.

    In the Altera TSE driver cases, the register access cleanups
    in net-next overlapped with bug fixes done in net.

    Similarly a bug fix to send ALB packets in the bonding driver using
    the right source address overlaps with cleanups in net-next.

    Signed-off-by: David S. Miller

    David S. Miller
     

16 May, 2014

1 commit

  • Conflicts:
    net/ipv4/ip_vti.c

    Steffen Klassert says:

    ====================
    pull request (net): ipsec 2014-05-15

    This pull request has a merge conflict in net/ipv4/ip_vti.c
    between commit 8d89dcdf80d8 ("vti: don't allow to add the same
    tunnel twice") and commit a32452366b72 ("vti4:Don't count header
    length twice"). It can be solved like it is done in linux-next.

    1) Fix a ipv6 xfrm output crash when a packet is rerouted
    by netfilter to not use IPsec.

    2) vti4 counts some header lengths twice leading to an incorrect
    device mtu. Fix this by counting these headers only once.

    3) We don't catch the case if an unsupported protocol is submitted
    to the xfrm protocol handlers, this can lead to NULL pointer
    dereferences. Fix this by adding the appropriate checks.

    4) vti6 may unregister pernet ops twice on init errors.
    Fix this by removing one of the calls to do it only once.
    From Mathias Krause.

    5) Set the vti tunnel mark before doing a lookup in the error
    handlers. Otherwise we don't find the correct xfrm state.
    ====================

    The conflict in ip_vti.c was simple, 'net' had a commit
    removing a line from vti_tunnel_init() and this tree
    being merged had a commit adding a line to the same
    location.

    Signed-off-by: David S. Miller

    David S. Miller
     

13 May, 2014

1 commit

  • As suggested by several people, rename local_df to ignore_df,
    since it means "ignore df bit if it is set".

    Cc: Maciej Żenczykowski
    Cc: Florian Westphal
    Cc: David S. Miller
    Cc: Eric Dumazet
    Signed-off-by: Cong Wang
    Acked-by: Maciej Żenczykowski
    Signed-off-by: David S. Miller

    WANG Cong
     

16 Apr, 2014

1 commit

  • In the dst->output() path for ipv4, the code assumes the skb it has to
    transmit is attached to an inet socket, specifically via
    ip_mc_output() : The sk_mc_loop() test triggers a WARN_ON() when the
    provider of the packet is an AF_PACKET socket.

    The dst->output() method gets an additional 'struct sock *sk'
    parameter. This needs a cascade of changes so that this parameter can
    be propagated from vxlan to final consumer.

    Fixes: 8f646c922d55 ("vxlan: keep original skb ownership")
    Reported-by: lucien xin
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

07 Apr, 2014

1 commit

  • The ipv6 xfrm output path is not aware that packets can be
    rerouted by NAT to not use IPsec. We crash in this case
    because we expect to have a xfrm state at the dst_entry.
    This crash happens if the ipv6 layer does IPsec and NAT
    or if we have an interfamily IPsec tunnel with ipv4 NAT.

    We fix this by checking for a NAT rerouted packet in each
    address family and dst_output() to the new destination
    in this case.

    Reported-by: Martin Pelikan
    Tested-by: Martin Pelikan
    Signed-off-by: Steffen Klassert

    Steffen Klassert
     

26 Aug, 2013

1 commit

  • In commit 0ea9d5e3e0e03a63b11392f5613378977dae7eca ("xfrm: introduce
    helper for safe determination of mtu") I switched the determination of
    ipv4 mtus from dst_mtu to ip_skb_dst_mtu. This was an error because in
    case of IP_PMTUDISC_PROBE we fall back to the interface mtu, which is
    never correct for ipv4 ipsec.

    This patch partly reverts 0ea9d5e3e0e03a63b11392f5613378977dae7eca
    ("xfrm: introduce helper for safe determination of mtu").

    Cc: Steffen Klassert
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: Steffen Klassert

    Hannes Frederic Sowa
     

14 Aug, 2013

2 commits

  • skb->sk socket can be of AF_INET or AF_INET6 address family. Thus we
    always have to make sure we a referring to the correct interpretation
    of skb->sk.

    We only depend on header defines to query the mtu, so we don't introduce
    a new dependency to ipv6 by this change.

    Cc: Steffen Klassert
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: Steffen Klassert

    Hannes Frederic Sowa
     
  • In xfrm4 and xfrm6 we need to take care about sockets of the other
    address family. This could happen because a 6in4 or 4in6 tunnel could
    get protected by ipsec.

    Because we don't want to have a run-time dependency on ipv6 when only
    using ipv4 xfrm we have to embed a pointer to the correct local_error
    function in xfrm_state_afinet and look it up when returning an error
    depending on the socket address family.

    Thanks to vi0ss for the great bug report:

    v2:
    a) fix two more unsafe interpretations of skb->sk as ipv6 socket
    (xfrm6_local_dontfrag and __xfrm6_output)
    v3:
    a) add an EXPORT_SYMBOL_GPL(xfrm_local_error) to fix a link error when
    building ipv6 as a module (thanks to Steffen Klassert)

    Reported-by:
    Cc: Steffen Klassert
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: Steffen Klassert

    Hannes Frederic Sowa
     

02 Jul, 2011

1 commit


11 May, 2011

1 commit

  • As it is, we assign the outer modes output function to the dst entry
    when we create the xfrm bundle. This leads to two problems on interfamily
    scenarios. We might insert ipv4 packets into ip6_fragment when called
    from xfrm6_output. The system crashes if we try to fragment an ipv4
    packet with ip6_fragment. This issue was introduced with git commit
    ad0081e4 (ipv6: Fragment locally generated tunnel-mode IPSec6 packets
    as needed). The second issue is, that we might insert ipv4 packets in
    netfilter6 and vice versa on interfamily scenarios.

    With this patch we assign the inner mode output function to the dst entry
    when we create the xfrm bundle. So xfrm4_output/xfrm6_output from the inner
    mode is used and the right fragmentation and netfilter functions are called.
    We switch then to outer mode with the output_finish functions.

    Signed-off-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Steffen Klassert
     

25 Mar, 2010

1 commit


03 Jun, 2009

1 commit

  • Define three accessors to get/set dst attached to a skb

    struct dst_entry *skb_dst(const struct sk_buff *skb)

    void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

    void skb_dst_drop(struct sk_buff *skb)
    This one should replace occurrences of :
    dst_release(skb->dst)
    skb->dst = NULL;

    Delete skb->dst field

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

25 Mar, 2008

1 commit


29 Jan, 2008

5 commits

  • The IPv4 and IPv6 hook values are identical, yet some code tries to figure
    out the "correct" value by looking at the address family. Introduce NF_INET_*
    values for both IPv4 and IPv6. The old values are kept in a #ifndef __KERNEL__
    section for userspace compatibility.

    Signed-off-by: Patrick McHardy
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • The nhoff field isn't actually necessary in xfrm_input. For tunnel
    mode transforms we now throw away the output IP header so it makes no
    sense to fill in the nexthdr field. For transport mode we can now let
    the function transport_finish do the setting and it knows where the
    nexthdr field is.

    The only other thing that needs the nexthdr field to be set is the
    header extraction code. However, we can simply move the protocol
    extraction out of the generic header extraction.

    We want to minimise the amount of info we have to carry around between
    transforms as this simplifies the resumption process for async crypto.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • As part of the work on asynchrnous cryptographic operations, we need
    to be able to resume from the spot where they occur. As such, it
    helps if we isolate them to one spot.

    This patch moves most of the remaining family-specific processing into
    the common output code.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Most callers of the LOCAL_OUT chain will set the IP packet length and
    header checksum before doing so. They also share the same output
    function dst_output.

    This patch creates a new function called ip_local_out which does all
    of that and converts the appropriate users over to it.

    Apart from removing duplicate code, it will also help in merging the
    IPsec output path once the same thing is done for IPv6.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • With inter-family transforms the inner mode differs from the outer
    mode. Attempting to handle both sides from the same function means
    that it needs to handle both IPv4 and IPv6 which creates duplication
    and confusion.

    This patch separates the two parts on the output path so that each
    function deals with one family only.

    In particular, the functions xfrm4_extract_output/xfrm6_extract_output
    moves the pertinent fields from the IPv4/IPv6 IP headers into a
    neutral format stored in skb->cb. This is then used by the outer mode
    output functions to write the outer IP header. In this way the output
    function no longer has to know about the inner address family.

    Since the extract functions are only called by tunnel modes (the only
    modes that can support inter-family transforms), I've also moved the
    xfrm*_tunnel_check_size calls into them. This allows the correct ICMP
    message to be sent as opposed to now where you might call icmp_send
    with an IPv6 packet and vice versa.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

18 Oct, 2007

2 commits

  • This patch adds a new field to xfrm states called inner_mode. The existing
    mode object is renamed to outer_mode.

    This is the first part of an attempt to fix inter-family transforms. As it
    is we always use the outer family when determining which mode to use. As a
    result we may end up shoving IPv4 packets into netfilter6 and vice versa.

    What we really want is to use the inner family for the first part of outbound
    processing and the outer family for the second part. For inbound processing
    we'd use the opposite pairing.

    I've also added a check to prevent silly combinations such as transport mode
    with inter-family transforms.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Currently BEET mode does not reinject the packet back into the stack
    like tunnel mode does. Since BEET should behave just like tunnel mode
    this is incorrect.

    This patch fixes this by introducing a flags field to xfrm_mode that
    tells the IPsec code whether it should terminate and reinject the packet
    back into the stack.

    It then sets the flag for BEET and tunnel mode.

    I've also added a number of missing BEET checks elsewhere where we check
    whether a given mode is a tunnel or not.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

16 Oct, 2007

1 commit


11 Oct, 2007

2 commits

  • This patch moves the setting of the IP length and checksum fields out of
    the transforms and into the xfrmX_output functions. This would help future
    efforts in merging the transforms themselves.

    It also adds an optimisation to ipcomp due to the fact that the transport
    offset is guaranteed to be zero.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Most of the code in xfrm4_output_one and xfrm6_output_one are identical so
    this patch moves them into a common xfrm_output function which will live
    in net/xfrm.

    In fact this would seem to fix a bug as on IPv4 we never reset the network
    header after a transform which may upset netfilter later on.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

26 Apr, 2007

1 commit


11 Feb, 2007

1 commit


23 Sep, 2006

3 commits


09 Jul, 2006

1 commit

  • This patch adds the wrapper function skb_is_gso which can be used instead
    of directly testing skb_shinfo(skb)->gso_size. This makes things a little
    nicer and allows us to change the primary key for indicating whether an skb
    is GSO (if we ever want to do that).

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu