17 Mar, 2016

1 commit


16 Jan, 2016

1 commit

  • Skb_gso_segment() uses skb control block during segmentation.
    This patch adds 32-bytes room for previous control block which
    will be copied into all resulting segments.

    This patch fixes kernel crash during fragmenting forwarded packets.
    Fragmentation requires valid IP CB in skb for clearing ip options.
    Also patch removes custom save/restore in ovs code, now it's redundant.

    Signed-off-by: Konstantin Khlebnikov
    Link: http://lkml.kernel.org/r/CALYGNiP-0MZ-FExV2HutTvE9U-QQtkKSoE--KN=JQE5STYsjAA@mail.gmail.com
    Signed-off-by: David S. Miller

    Konstantin Khlebnikov
     

08 Oct, 2015

3 commits


18 Sep, 2015

4 commits

  • In code review it was noticed that I had failed to add some blank lines
    in places where they are customarily used. Taking a second look at the
    code I have to agree blank lines would be nice so I have added them
    here.

    Reported-by: Nicolas Dichtel
    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • This is immediately motivated by the bridge code that chains functions that
    call into netfilter. Without passing net into the okfns the bridge code would
    need to guess about the best expression for the network namespace to process
    packets in.

    As net is frequently one of the first things computed in continuation functions
    after netfilter has done it's job passing in the desired network namespace is in
    many cases a code simplification.

    To support this change the function dst_output_okfn is introduced to
    simplify passing dst_output as an okfn. For the moment dst_output_okfn
    just silently drops the struct net.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • Pass a network namespace parameter into the netfilter hooks. At the
    call site of the netfilter hooks the path a packet is taking through
    the network stack is well known which allows the network namespace to
    be easily and reliabily.

    This allows the replacement of magic code like
    "dev_net(state->in?:state->out)" that appears at the start of most
    netfilter hooks with "state->net".

    In almost all cases the network namespace passed in is derived
    from the first network device passed in, guaranteeing those
    paths will not see any changes in practice.

    The exceptions are:
    xfrm/xfrm_output.c:xfrm_output_resume() xs_net(skb_dst(skb)->xfrm)
    ipvs/ip_vs_xmit.c:ip_vs_nat_send_or_cont() ip_vs_conn_net(cp)
    ipvs/ip_vs_xmit.c:ip_vs_send_or_cont() ip_vs_conn_net(cp)
    ipv4/raw.c:raw_send_hdrinc() sock_net(sk)
    ipv6/ip6_output.c:ip6_xmit() sock_net(sk)
    ipv6/ndisc.c:ndisc_send_skb() dev_net(skb->dev) not dev_net(dst->dev)
    ipv6/raw.c:raw6_send_hdrinc() sock_net(sk)
    br_netfilter_hooks.c:br_nf_pre_routing_finish() dev_net(skb->dev) before skb->dev is set to nf_bridge->physindev

    In all cases these exceptions seem to be a better expression for the
    network namespace the packet is being processed in then the historic
    "dev_net(in?in:out)". I am documenting them in case something odd
    pops up and someone starts trying to track down what happened.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • Add a sock paramter to dst_output making dst_output_sk superfluous.
    Add a skb->sk parameter to all of the callers of dst_output
    Have the callers of dst_output_sk call dst_output.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

13 May, 2015

1 commit


08 Apr, 2015

1 commit

  • On the output paths in particular, we have to sometimes deal with two
    socket contexts. First, and usually skb->sk, is the local socket that
    generated the frame.

    And second, is potentially the socket used to control a tunneling
    socket, such as one the encapsulates using UDP.

    We do not want to disassociate skb->sk when encapsulating in order
    to fix this, because that would break socket memory accounting.

    The most extreme case where this can cause huge problems is an
    AF_PACKET socket transmitting over a vxlan device. We hit code
    paths doing checks that assume they are dealing with an ipv4
    socket, but are actually operating upon the AF_PACKET one.

    Signed-off-by: David S. Miller

    David Miller
     

21 Oct, 2014

1 commit

  • skb_gso_segment has three possible return values:
    1. a pointer to the first segmented skb
    2. an errno value (IS_ERR())
    3. NULL. This can happen when GSO is used for header verification.

    However, several callers currently test IS_ERR instead of IS_ERR_OR_NULL
    and would oops when NULL is returned.

    Note that these call sites should never actually see such a NULL return
    value; all callers mask out the GSO bits in the feature argument.

    However, there have been issues with some protocol handlers erronously not
    respecting the specified feature mask in some cases.

    It is preferable to get 'have to turn off hw offloading, else slow' reports
    rather than 'kernel crashes'.

    Signed-off-by: Florian Westphal
    Signed-off-by: David S. Miller

    Florian Westphal
     

10 Sep, 2014

1 commit


13 May, 2014

1 commit


19 Aug, 2013

1 commit

  • We need to choose the protocol family by skb->protocol. Otherwise we
    call the wrong xfrm{4,6}_local_error handler in case an ipv6 sockets is
    used in ipv4 mode, in which case we should call down to xfrm4_local_error
    (ip6 sockets are a superset of ip4 ones).

    We are called before before ip_output functions, so skb->protocol is
    not reset.

    Cc: Steffen Klassert
    Acked-by: Eric Dumazet
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: Steffen Klassert

    Hannes Frederic Sowa
     

14 Aug, 2013

1 commit

  • In xfrm4 and xfrm6 we need to take care about sockets of the other
    address family. This could happen because a 6in4 or 4in6 tunnel could
    get protected by ipsec.

    Because we don't want to have a run-time dependency on ipv6 when only
    using ipv4 xfrm we have to embed a pointer to the correct local_error
    function in xfrm_state_afinet and look it up when returning an error
    depending on the socket address family.

    Thanks to vi0ss for the great bug report:

    v2:
    a) fix two more unsafe interpretations of skb->sk as ipv6 socket
    (xfrm6_local_dontfrag and __xfrm6_output)
    v3:
    a) add an EXPORT_SYMBOL_GPL(xfrm_local_error) to fix a link error when
    building ipv6 as a module (thanks to Steffen Klassert)

    Reported-by:
    Cc: Steffen Klassert
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: Steffen Klassert

    Hannes Frederic Sowa
     

05 Jun, 2013

1 commit


23 May, 2013

1 commit

  • The error exit path needs err explicitly set. Otherwise it
    returns success and the only caller, xfrm_output_resume(),
    would oops in skb_dst(skb)->ops derefence as skb_dst(skb) is
    NULL.

    Bug introduced in commit bb65a9cb (xfrm: removes a superfluous
    check and add a statistic).

    Signed-off-by: Timo Teräs
    Cc: Li RongQing
    Cc: Steffen Klassert
    Signed-off-by: David S. Miller

    Timo Teräs
     

01 Feb, 2013

1 commit


07 Jan, 2013

1 commit

  • Remove the check if x->km.state equal to XFRM_STATE_VALID in
    xfrm_state_check_expire(), which will be done before call
    xfrm_state_check_expire().

    add a LINUX_MIB_XFRMOUTSTATEINVALID statistic to record the
    outbound error due to invalid xfrm state.

    Signed-off-by: Li RongQing
    Signed-off-by: Steffen Klassert

    Li RongQing
     

23 Mar, 2012

1 commit


28 Mar, 2011

2 commits


14 Mar, 2011

2 commits


17 Sep, 2010

1 commit


05 Jun, 2010

1 commit

  • xfrm triggers a warning if dst_pop() drops a refcount
    on a noref dst. This patch changes dst_pop() to
    skb_dst_pop(). skb_dst_pop() drops the refcnt only
    on a refcounted dst. Also we don't clone the child
    dst_entry, so it is not refcounted and we can use
    skb_dst_set_noref() in xfrm_output_one().

    Signed-off-by: Steffen Klassert
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Steffen Klassert
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the followings.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and try to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    wildly available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build test were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable easily on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

03 Jun, 2009

1 commit

  • Define three accessors to get/set dst attached to a skb

    struct dst_entry *skb_dst(const struct sk_buff *skb)

    void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

    void skb_dst_drop(struct sk_buff *skb)
    This one should replace occurrences of :
    dst_release(skb->dst)
    skb->dst = NULL;

    Delete skb->dst field

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

26 Nov, 2008

2 commits


30 Sep, 2008

1 commit


14 Aug, 2008

1 commit


13 May, 2008

1 commit

  • This patch adds needed_headroom/needed_tailroom members to struct
    net_device and updates many places that allocate sbks to use them. Not
    all of them can be converted though, and I'm sure I missed some (I
    mostly grepped for LL_RESERVED_SPACE)

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     

29 Apr, 2008

1 commit

  • Some drivers have duplicated unlikely() macros. IS_ERR() already has
    unlikely() in itself.

    This patch cleans up such pointless code.

    Signed-off-by: Hirofumi Nakagawa
    Acked-by: David S. Miller
    Acked-by: Jeff Garzik
    Cc: Paul Clements
    Cc: Richard Purdie
    Cc: Alessandro Zummo
    Cc: David Brownell
    Cc: James Bottomley
    Cc: Michael Halcrow
    Cc: Anton Altaparmakov
    Cc: Al Viro
    Cc: Carsten Otte
    Cc: Patrick McHardy
    Cc: Paul Mundt
    Cc: Jaroslav Kysela
    Cc: Takashi Iwai
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hirofumi Nakagawa
     

25 Mar, 2008

1 commit


13 Feb, 2008

1 commit


01 Feb, 2008

1 commit

  • o Outbound sequence number overflow error status
    is counted as XfrmOutStateSeqError.
    o Additionaly, it changes inbound sequence number replay
    error name from XfrmInSeqOutOfWindow to XfrmInStateSeqError
    to apply name scheme above.
    o Inbound IPv4 UDP encapsuling type mismatch error is wrongly
    mapped to XfrmInStateInvalid then this patch fiex the error
    to XfrmInStateMismatch.

    Signed-off-by: Masahide NAKAMURA
    Signed-off-by: David S. Miller

    Masahide NAKAMURA
     

29 Jan, 2008

2 commits