22 Jun, 2019

1 commit


02 Apr, 2019

1 commit

  • Configuration check to accept source route IP options should be made on
    the incoming netdevice when the skb->dev is an l3mdev master. The route
    lookup for the source route next hop also needs the incoming netdev.

    v2->v3:
    - Simplify by passing the original netdevice down the stack (per David
    Ahern).

    Signed-off-by: Stephen Suryaputra
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller

    Stephen Suryaputra
     

26 Feb, 2019

1 commit


02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

07 Aug, 2017

2 commits

  • __ip_options_echo() uses the current network namespace, and
    currently retrieves it via skb->dst->dev.

    This commit adds an explicit 'net' argument to __ip_options_echo()
    and updates all the call sites to provide it, usually via a simpler
    sock_net().

    After this change, __ip_options_echo() no longer needs to access
    skb->dst and we can drop a couple of hacks that preserved such
    info in the rx path.

    Signed-off-by: Paolo Abeni
    Signed-off-by: David S. Miller

    Paolo Abeni
     
  • While computing the response option set for LSRR, ip_options_echo()
    also changes the ingress packet LSRR addresses list, setting
    the last one to the dst specific address for the ingress packet
    - via memset(start[ ...
    The only visible effect of such a change - beyond possibly damaging
    shared/cloned skbs - is modifying the data carried by ICMP replies,
    changing the reported header information of the ingress packet,
    which violates RFC1122 3.2.2.6.
    All the other call sites just ignore the ingress packet IP options
    after calling ip_options_echo().
    Note that the last element in the LSRR option address list for the
    reply packet will be properly set later in the ip output path
    via ip_options_build().
    This buggy memset() predates git history and apparently was present
    in the initial ip_options_echo() implementation in linux 1.3.30 but
    still looks wrong.

    The removal of the fib_compute_spec_dst() call will help a later
    patch drop the skb->dst usage by __ip_options_echo() completely.

    Signed-off-by: Paolo Abeni
    Signed-off-by: David S. Miller

    Paolo Abeni
     

25 Dec, 2016

1 commit


02 Mar, 2016

1 commit

  • ICMP timestamp messages and IP source route options require
    timestamps to be in milliseconds modulo 24 hours from
    midnight UT format.

    Add inet_current_timestamp() function to support this. The function
    returns the required timestamp in network byte order.

    Timestamp calculation is also changed to call ktime_get_real_ts64()
    which uses struct timespec64. struct timespec64 is y2038 safe.
    Previously it called getnstimeofday() which uses struct timespec.
    struct timespec is not y2038 safe.

    Signed-off-by: Deepa Dinamani
    Cc: "David S. Miller"
    Cc: Alexey Kuznetsov
    Cc: Hideaki YOSHIFUJI
    Cc: James Morris
    Cc: Patrick McHardy
    Acked-by: YOSHIFUJI Hideaki
    Acked-by: Arnd Bergmann
    Signed-off-by: David S. Miller

    Deepa Dinamani
     

04 Apr, 2015

1 commit

  • The ipv4 code uses a mixture of coding styles. In some instances a
    check for a non-NULL pointer is written as x != NULL and in others
    as x. The latter is preferred according to checkpatch, and this patch
    makes the code consistent by adopting that form.

    No changes detected by objdiff.

    Signed-off-by: Ian Morris
    Signed-off-by: David S. Miller

    Ian Morris
     

29 Sep, 2014

1 commit


22 Jul, 2014

1 commit

  • There is a benign buffer overflow in ip_options_compile spotted by
    AddressSanitizer[1] :

    It's benign because we can always access one extra byte in skb->head
    (because the header is followed by struct skb_shared_info), and in this
    case this byte is not even used.

    [28504.910798] ==================================================================
    [28504.912046] AddressSanitizer: heap-buffer-overflow in ip_options_compile
    [28504.913170] Read of size 1 by thread T15843:
    [28504.914026] [] ip_options_compile+0x121/0x9c0
    [28504.915394] [] ip_options_get_from_user+0xad/0x120
    [28504.916843] [] do_ip_setsockopt.isra.15+0x8df/0x1630
    [28504.918175] [] ip_setsockopt+0x30/0xa0
    [28504.919490] [] tcp_setsockopt+0x5b/0x90
    [28504.920835] [] sock_common_setsockopt+0x5f/0x70
    [28504.922208] [] SyS_setsockopt+0xa2/0x140
    [28504.923459] [] system_call_fastpath+0x16/0x1b
    [28504.924722]
    [28504.925106] Allocated by thread T15843:
    [28504.925815] [] ip_options_get_from_user+0x35/0x120
    [28504.926884] [] do_ip_setsockopt.isra.15+0x8df/0x1630
    [28504.927975] [] ip_setsockopt+0x30/0xa0
    [28504.929175] [] tcp_setsockopt+0x5b/0x90
    [28504.930400] [] sock_common_setsockopt+0x5f/0x70
    [28504.931677] [] SyS_setsockopt+0xa2/0x140
    [28504.932851] [] system_call_fastpath+0x16/0x1b
    [28504.934018]
    [28504.934377] The buggy address ffff880026382828 is located 0 bytes to the right
    [28504.934377] of 40-byte region [ffff880026382800, ffff880026382828)
    [28504.937144]
    [28504.937474] Memory state around the buggy address:
    [28504.938430] ffff880026382300: ........ rrrrrrrr rrrrrrrr rrrrrrrr
    [28504.939884] ffff880026382400: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28504.941294] ffff880026382500: .....rrr rrrrrrrr rrrrrrrr rrrrrrrr
    [28504.942504] ffff880026382600: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28504.943483] ffff880026382700: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28504.944511] >ffff880026382800: .....rrr rrrrrrrr rrrrrrrr rrrrrrrr
    [28504.945573] ^
    [28504.946277] ffff880026382900: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28505.094949] ffff880026382a00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28505.096114] ffff880026382b00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28505.097116] ffff880026382c00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28505.098472] ffff880026382d00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
    [28505.099804] Legend:
    [28505.100269] f - 8 freed bytes
    [28505.100884] r - 8 redzone bytes
    [28505.101649] . - 8 allocated bytes
    [28505.102406] x=1..7 - x allocated bytes + (8-x) redzone bytes
    [28505.103637] ==================================================================

    [1] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel
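
The kind of one-byte over-read reported above typically comes from reading
an option's length byte before checking that it lies inside the buffer. A
minimal sketch of a bounds-checked walk over type/length encoded IP options
follows. count_ip_options() is a hypothetical helper written for
illustration and is not the code in ip_options_compile().

```c
#include <stddef.h>
#include <stdint.h>

#define OPT_END  0  /* end of option list, single byte */
#define OPT_NOOP 1  /* no-operation padding, single byte */

/* Count the multi-byte options in opts[0..len). Returns -1 if an option
 * is malformed or would extend past the end of the buffer. */
int count_ip_options(const uint8_t *opts, size_t len)
{
    size_t off = 0;
    int count = 0;

    while (off < len) {
        uint8_t type = opts[off];
        size_t remaining = len - off;
        uint8_t optlen;

        if (type == OPT_END)
            break;
        if (type == OPT_NOOP) {
            off++;
            continue;
        }
        /* The length byte must itself be inside the buffer before it
         * is read; without this check opts[off + 1] can be one byte
         * past the end. */
        if (remaining < 2)
            return -1;
        optlen = opts[off + 1];
        if (optlen < 2 || optlen > remaining)
            return -1;
        off += optlen;
        count++;
    }
    return count;
}
```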

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

29 Apr, 2014

1 commit


02 Jan, 2014

1 commit


27 Dec, 2013

1 commit


12 Mar, 2013

1 commit

  • This is needed in order to detect if the timestamp option appears
    more than once in a packet, to remove the option if the packet is
    fragmented, etc. My previous change neglected to store the option
    location when the router addresses were prespecified and Pointer >
    Length. But now the option location is also stored when Flag is an
    unrecognized value, to ensure these option handling behaviors are
    still performed.

    Signed-off-by: David Ward
    Signed-off-by: David S. Miller

    David Ward
     

06 Mar, 2013

1 commit

  • When a router forwards a packet that contains the IPv4 timestamp option,
    if there is no space left in the option for the router to add its own
    timestamp, then the router increments the Overflow value in the option.

    However, if the addresses of the routers are prespecified in the option,
    then the overflow condition cannot happen: the option is structured so
    that each prespecified router has a place to write its timestamp. Other
    routers do not add a timestamp, so there will never be a lack of space.

    This fix ensures that the Overflow value in the IPv4 timestamp option is
    not incremented when the addresses of the routers are prespecified, even
    if the Pointer value is greater than the Length value.
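
The rule described above can be sketched as a small stand-alone function.
This is an illustration under stated assumptions, not the kernel code:
ts_note_full() and the TS_FLAG_* names are hypothetical, and the layout
(type, Length, Pointer, then Overflow in the high nibble and Flag in the low
nibble of the fourth byte) follows the RFC 791 timestamp option.

```c
#include <stdint.h>

#define TS_FLAG_TSONLY    0  /* timestamps only */
#define TS_FLAG_TSANDADDR 1  /* each router adds address + timestamp */
#define TS_FLAG_PRESPEC   3  /* router addresses are prespecified */

/* opt points at the first byte of a timestamp option.
 * Returns 1 if the Overflow counter was incremented, 0 if nothing had
 * to be done, -1 if the counter is already at its maximum of 15. */
int ts_note_full(uint8_t *opt)
{
    uint8_t length = opt[1];
    uint8_t pointer = opt[2];
    uint8_t flag = opt[3] & 0x0F;
    uint8_t overflow = opt[3] >> 4;

    if (pointer <= length)
        return 0;   /* there is still room in the option */
    if (flag == TS_FLAG_PRESPEC)
        return 0;   /* prespecified routers: never an overflow */
    if (overflow == 15)
        return -1;  /* counter would wrap */

    opt[3] = (uint8_t)(flag | ((overflow + 1) << 4));
    return 1;
}
```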

    Signed-off-by: David Ward
    Signed-off-by: David S. Miller

    David Ward
     

19 Nov, 2012

1 commit

  • Allow an unprivileged user who has created a user namespace, and then
    created a network namespace to effectively use the new network
    namespace, by reducing capable(CAP_NET_ADMIN) and
    capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns,
    CAP_NET_ADMIN), or ns_capable(net->user_ns, CAP_NET_RAW) calls.

    Settings that merely control a single network device are allowed.
    Either the network device is a logical network device where
    restrictions make no difference or the network device is a hardware NIC
    that has been explicitly moved from the initial network namespace.

    In general policy and network stack state changes are allowed
    while resource control is left unchanged.

    Allow creating raw sockets.
    Allow the SIOCSARP ioctl to control the arp cache.
    Allow the SIOCSIFFLAGS ioctl to allow setting network device flags.
    Allow the SIOCSIFADDR ioctl to allow setting a netdevice ipv4 address.
    Allow the SIOCSIFBRDADDR ioctl to allow setting a netdevice ipv4 broadcast address.
    Allow the SIOCSIFDSTADDR ioctl to allow setting a netdevice ipv4 destination address.
    Allow the SIOCSIFNETMASK ioctl to allow setting a netdevice ipv4 netmask.
    Allow the SIOCADDRT and SIOCDELRT ioctls to allow adding and deleting ipv4 routes.

    Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for
    adding, changing and deleting gre tunnels.

    Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for
    adding, changing and deleting ipip tunnels.

    Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for
    adding, changing and deleting ipsec virtual tunnel interfaces.

    Allow setting the MRT_INIT, MRT_DONE, MRT_ADD_VIF, MRT_DEL_VIF, MRT_ADD_MFC,
    MRT_DEL_MFC, MRT_ASSERT, MRT_PIM, MRT_TABLE socket options on multicast routing
    sockets.

    Allow setting and receiving IPOPT_CIPSO, IP_OPT_SEC, IP_OPT_SID and
    arbitrary ip options.

    Allow setting IP_SEC_POLICY/IP_XFRM_POLICY ipv4 socket option.
    Allow setting the IP_TRANSPARENT ipv4 socket option.
    Allow setting the TCP_REPAIR socket option.
    Allow setting the TCP_CONGESTION socket option.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

19 Jul, 2012

1 commit


05 Jul, 2012

2 commits


28 Jun, 2012

1 commit

  • The specific destination is the host we direct unicast replies to.
    Usually this is the original packet source address, but if we are
    responding to a multicast or broadcast packet we have to use something
    different.

    Specifically we must use the source address we would use if we were to
    send a packet to the unicast source of the original packet.

    The routing cache precomputes this value, but we want to remove that
    precomputation because it creates a hard dependency on the expensive
    rpfilter source address validation which we'd like to make cheaper.

    There are only three places where this matters:

    1) ICMP replies.

    2) pktinfo CMSG

    3) IP options

    Now there will be no real users of rt->rt_spec_dst and we can simply
    remove it altogether.

    Signed-off-by: David S. Miller

    David S. Miller
     

16 May, 2012

1 commit


16 Apr, 2012

2 commits


13 Mar, 2012

1 commit

  • Add #define pr_fmt(fmt) as appropriate.

    Add "IPv4: ", "TCP: ", and "IPsec: " to appropriate files.
    Standardize on "UDPLite: " for appropriate uses.
    Some prefixes were previously "UDPLITE: " and "UDP-Lite: ".

    Add KBUILD_MODNAME ": " to icmp and gre.
    Remove embedded prefixes as appropriate.

    Add missing "\n" to pr_info in gre.c.
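
The prefixing mechanism can be illustrated in user space. In the sketch
below only pr_fmt() mirrors the kernel convention; pr_info() is a
printf-based stand-in and pr_info_to_buf() is a hypothetical helper added so
the result can be inspected. The ##__VA_ARGS__ form is a GNU C extension.

```c
#include <stdio.h>

/* A per-file prefix: string literal concatenation glues it to the format. */
#define pr_fmt(fmt) "IPv4: " fmt

/* User-space stand-in for the kernel's pr_info(). */
#define pr_info(fmt, ...) printf(pr_fmt(fmt), ##__VA_ARGS__)

/* Hypothetical variant that formats into a buffer instead of printing. */
#define pr_info_to_buf(buf, size, fmt, ...) \
    snprintf((buf), (size), pr_fmt(fmt), ##__VA_ARGS__)
```

With these definitions, pr_info("bad option %d\n", 7) prints the message
with the "IPv4: " prefix, so individual call sites no longer need to embed
it in their format strings.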

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     

12 Mar, 2012

1 commit

  • Use a more current kernel messaging style.

    Convert a printk block to print_hex_dump.
    Coalesce formats, align arguments.
    Use %s, __func__ instead of embedding function names.

    Some messages that were prefixed with _close are
    now prefixed with _fini. Some ah4 and esp messages
    are now not prefixed with "ip ".

    The intent of this patch is to later add something like
    #define pr_fmt(fmt) "IPv4: " fmt.
    to standardize the output messages.

    Text size is trivially reduced. (x86-32 allyesconfig)

    $ size net/ipv4/built-in.o*
    text data bss dec hex filename
    887888 31558 249696 1169142 11d6f6 net/ipv4/built-in.o.new
    887934 31558 249800 1169292 11d78c net/ipv4/built-in.o.old

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     

11 Feb, 2012

1 commit

  • This patch fixes a bug introduced by commit ac8a4810 (ipv4: Save
    nexthop address of LSRR/SSRR option to IPCB.). In that patch, we saved
    the nexthop of SRR in ip_option->nexthop and did not update iph->daddr
    until we got to ip_forward_options(), but we need to update it before
    ip_rt_get_source(); otherwise we may get a wrong src.

    Signed-off-by: Li Wei
    Signed-off-by: David S. Miller

    Li Wei
     

24 Nov, 2011

1 commit

  • We cannot update iph->daddr in ip_options_rcv_srr(); it is too early.
    When some exception occurs later (e.g. in ip_forward() when we goto
    sr_failed) we need the ip header to be identical to the original one,
    as ICMP needs it.

    Add a field 'nexthop' in struct ip_options to save nexthop of LSRR
    or SSRR option.

    Signed-off-by: Li Wei
    Signed-off-by: David S. Miller

    Li Wei
     

10 Nov, 2011

1 commit


01 Jun, 2011

1 commit

  • The current code takes an unaligned pointer and does htonl() on it to
    make it big-endian, then does a memcpy(). The problem is that the
    compiler decides that since the pointer is to a __be32, it is legal
    to optimize the copy into a processor word store. However, on an
    architecture that does not handle unaligned writes in kernel space,
    this produces an unaligned access fault.

    The solution is to track the pointer as a "char *" (which removes a bunch
    of unpleasant casts in any case), and then just use put_unaligned_be32()
    to write the value to memory.
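
A byte-wise store is one portable way to get the effect described above. The
sketch below is a user-space illustration, not the kernel's
put_unaligned_be32() implementation; store_be32_unaligned() and
load_be32_unaligned() are hypothetical names, and unsigned char is used here
rather than plain char to keep the shifts well defined.

```c
#include <stdint.h>

/* Store val in big-endian order at p, which may have any alignment.
 * Only single-byte writes are issued, so no word store can fault. */
void store_be32_unaligned(unsigned char *p, uint32_t val)
{
    p[0] = (unsigned char)(val >> 24);
    p[1] = (unsigned char)(val >> 16);
    p[2] = (unsigned char)(val >> 8);
    p[3] = (unsigned char)val;
}

/* Matching load, for reading the value back. */
uint32_t load_be32_unaligned(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

Because the result is defined byte by byte, it is the same on little-endian
and big-endian hosts and needs no separate htonl() step.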

    Signed-off-by: Chris Metcalf
    Signed-off-by: David S. Miller

    Chris Metcalf
     

14 May, 2011

3 commits

  • At this point iph->daddr equals what rt->rt_dst would hold.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Pass in the sk_buff so that we can fetch the necessary keys from
    the packet header when working with input routes.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • This code block executes when opt->srr_is_hit is set. It will be
    set only by ip_options_rcv_srr().

    ip_options_rcv_srr() walks until it hits a matching nexthop in the SRR
    option addresses, and when it matches one 1) looks up the route for
    that nexthop and 2) on route lookup success it writes that nexthop
    value into iph->daddr.

    ip_forward_options() runs later, and again walks the SRR option
    addresses looking for the option matching the destination of the route
    stored in skb_rtable(). This route will be the same exact one looked
    up for the nexthop by ip_options_rcv_srr().

    Therefore "rt->rt_dst == iph->daddr" must be true.

    All it really needs to do is record the route's source address in the
    matching SRR option address. It need not write iph->daddr again,
    since that has already been done by ip_options_rcv_srr() as detailed
    above.

    Signed-off-by: David S. Miller

    David S. Miller
     

13 May, 2011

2 commits


29 Apr, 2011

1 commit

  • We lack proper synchronization to manipulate inet->opt ip_options.

    The problem is that ip_make_skb() calls ip_setup_cork(), and
    ip_setup_cork() possibly makes a copy of ipc->opt (struct ip_options)
    without any protection against another thread manipulating inet->opt.

    Another thread can change inet->opt pointer and free old one under us.

    Use RCU to protect inet->opt (changed to inet->inet_opt).

    Instead of handling atomic refcounts, just copy ip_options when
    necessary, to avoid cache line dirtying.

    We can't insert an rcu_head in struct ip_options since it's included in
    skb->cb[], so this patch is large because I had to introduce a new
    ip_options_rcu structure.

    Signed-off-by: Eric Dumazet
    Cc: Herbert Xu
    Signed-off-by: David S. Miller

    Eric Dumazet
     

15 Apr, 2011

1 commit

  • Scot Doyle demonstrated ip_options_compile() could be called with an skb
    without an attached route, using a setup involving a bridge, netfilter,
    and forged IP packets.

    Let's make ip_options_compile() and ip_options_rcv_srr() a bit more
    robust, instead of changing bridge/netfilter code.

    With help from Hiroaki SHIMODA.

    Reported-by: Scot Doyle
    Tested-by: Scot Doyle
    Signed-off-by: Eric Dumazet
    Cc: Stephen Hemminger
    Acked-by: Hiroaki SHIMODA
    Signed-off-by: David S. Miller

    Eric Dumazet
     

28 Mar, 2011

1 commit

  • The current handling of echoed IP timestamp options with prespecified
    addresses has been rather broken since the 2.2.x kernels. As far as I
    understand it, it should behave like when originating packets.

    Currently it will only timestamp the next free slot if:
    - there is space for *two* timestamps
    - some random data from the echoed packet taken as an IP is *not* a local IP

    The first is caused by an off-by-one error. 'soffset' points to the next
    free slot and so we only need to have 'soffset + 7
    Signed-off-by: David S. Miller

    Jan Luebbe
     

20 Sep, 2010

1 commit

  • Related discussion here: http://lkml.org/lkml/2010/9/3/16

    Introduce a function br_parse_ip_options that will audit the
    skb and possibly refill IP options before a packet enters the
    IP stack. If no options are present, the function will zero out
    the skb cb area so that it is not misinterpreted as options by some
    unsuspecting IP layer routine. If packet consistency fails, drop it.

    Signed-off-by: Bandan Das
    Signed-off-by: David S. Miller

    Bandan Das
     

18 May, 2010

1 commit

  • This patch removes from net/ (but not any netfilter files)
    all the unnecessary return; statements that precede the
    last closing brace of void functions.

    It does not remove the returns that are immediately
    preceded by a label as gcc doesn't like that.

    Done via:
    $ grep -rP --include=*.[ch] -l "return;\n}" net/ | \
    xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }'

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches