Eric Lee / smarc-fsl-linux-kernel

11 Sep, 2016

1 commit

d66f6c0a8 net: ipv4: Remove l3mdev_get_saddr

No longer needed

Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2016-09-11 14:12:53 +0800

24 Apr, 2016

1 commit

1602f49b5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Conflicts were two cases of simple overlapping changes,
nothing serious.

In the UDP case, we need to add a hlist_add_tail_rcu()
to linux/rculist.h, because we've moved UDP socket handling
away from using nulls lists.

Signed-off-by: David S. Miller

David S. Miller
2016-04-24 06:51:33 +0800

12 Apr, 2016

1 commit

9ab179d83 net: vrf: Fix dst reference counting

Vivek reported a kernel exception deleting a VRF with an active
connection through it. The root cause is that the socket has a cached
reference to a dst that is destroyed. Converting the dst_destroy to
dst_release and letting proper reference counting kick in does not
work as the dst has a reference to the device which needs to be released
as well.

I talked to Hannes about this at netdev and he pointed out the ipv4 and
ipv6 dst handling has dst_ifdown for just this scenario. Rather than
continuing with the reinvented dst wheel in VRF just remove it and
leverage the ipv4 and ipv6 versions.

Fixes: 193125dbd8eb2 ("net: Introduce VRF device driver")
Fixes: 35402e3136634 ("net: Add IPv6 support to VRF device")

Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2016-04-12 03:56:20 +0800

08 Apr, 2016

1 commit

0340d0b9e net: Checks skb_dst to be NULL in inet_iif

In inet_iif check if skb_rtable is NULL for the skb and return
skb->skb_iif if it is.

This change allows inet_iif to be called before the dst
information has been set in the skb (e.g. when doing socket based
UDP GRO).

Signed-off-by: Tom Herbert
Signed-off-by: David S. Miller

Tom Herbert
2016-04-08 04:53:14 +0800
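
The fallback described in the inet_iif entry above can be pictured with a small userspace model. This is only a sketch: `struct rtable_model`, `struct skb_model` and `inet_iif_model` are hypothetical stand-ins written for illustration, not the kernel's definitions.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct rtable_model {
    int rt_iif;                      /* 0 means "use the skb's own ingress index" */
};

struct skb_model {
    int skb_iif;                     /* ingress interface recorded on receive */
    const struct rtable_model *rt;   /* NULL until dst information is attached */
};

/* Return the input interface index, falling back to skb_iif when no
 * route is attached yet or the route carries no explicit rt_iif. */
static int inet_iif_model(const struct skb_model *skb)
{
    if (skb->rt != NULL && skb->rt->rt_iif != 0)
        return skb->rt->rt_iif;
    return skb->skb_iif;
}
```

With this shape, a caller that runs before the route lookup (such as the socket based UDP GRO case mentioned in the commit message) gets a usable interface index instead of dereferencing a NULL pointer.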

17 Feb, 2016

1 commit

fa50d974d ipv4: Namespaceify ip_default_ttl sysctl knob

Signed-off-by: Nikolay Borisov
Signed-off-by: David S. Miller

Nikolay Borisov
2016-02-17 09:42:54 +0800

05 Jan, 2016

1 commit

b5bdacf3b net: Propagate lookup failure in l3mdev_get_saddr to caller

Commands run in a vrf context are not failing as expected on a route lookup:
root@kenny:~# ip ro ls table vrf-red
unreachable default

root@kenny:~# ping -I vrf-red -c1 -w1 10.100.1.254
ping: Warning: source address might be selected on device other than vrf-red.
PING 10.100.1.254 (10.100.1.254) from 0.0.0.0 vrf-red: 56(84) bytes of data.

--- 10.100.1.254 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 999ms

Since the vrf table does not have a route for 10.100.1.254 the ping
should have failed. The saddr lookup causes a full VRF table lookup.
Propagating a lookup failure to the user allows the command to fail as
expected:

root@kenny:~# ping -I vrf-red -c1 -w1 10.100.1.254
connect: No route to host

Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2016-01-05 11:58:30 +0800

07 Oct, 2015

2 commits

8cbb512c9 net: Add source address lookup op for VRF

Add operation to l3mdev to lookup source address for a given flow.
Add support for the operation to VRF driver and convert existing
IPv4 hooks to use the new lookup.

Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2015-10-07 19:27:44 +0800

6e2895a8e net: Rename FLOWI_FLAG_VRFSRC to FLOWI_FLAG_L3MDEV_SRC

Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2015-10-07 19:27:42 +0800

05 Oct, 2015

1 commit

79a131592 ipv4: ICMP packet inspection for multipath

ICMP packets are inspected to let them route together with the flow they
belong to, minimizing the chance that a problematic path will affect flows
on other paths, and so that anycast environments can work with ECMP.

Signed-off-by: Peter Nørlund
Signed-off-by: David S. Miller

Peter Nørlund
2015-10-05 18:00:04 +0800

30 Sep, 2015

2 commits

9478d12d3 net: Move netif_index_is_l3_master to l3mdev.h

Change CONFIG dependency to CONFIG_NET_L3_MASTER_DEV as well.

Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2015-09-30 11:40:34 +0800

007979eaf net: Rename IFF_VRF_MASTER to IFF_L3MDEV_MASTER

Rename IFF_VRF_MASTER to IFF_L3MDEV_MASTER and update the name of the
netif_is_vrf and netif_index_is_vrf macros.

Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2015-09-30 11:40:32 +0800

27 Sep, 2015

1 commit

4963ed48f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Conflicts:
net/ipv4/arp.c

The net/ipv4/arp.c conflict was one commit adding a new
local variable while another commit was deleting one.

Signed-off-by: David S. Miller

David S. Miller
2015-09-27 07:08:27 +0800

26 Sep, 2015

1 commit

6f9c96154 inet: constify ip_route_output_flow() socket argument

Very soon, the TCP stack might call inet_csk_route_req(), which
calls ip_route_output_flow() with an unlocked listener socket,
so we need to make sure ip_route_output_flow() is not trying to
change any field of its socket argument.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2015-09-26 04:00:37 +0800

18 Sep, 2015

1 commit

58189ca7b net: Fix vti use case with oif in dst lookups

Steffen reported that the recent change to add oif to dst lookups breaks
the VTI use case. The problem is that with the oif set in the flow struct
the comparison to the nh_oif is triggered. Fix by splitting the
FLOWI_FLAG_VRFSRC into 2 flags -- one that triggers the vrf device cache
bypass (FLOWI_FLAG_VRFSRC) and another telling the lookup to not compare
nh oif (FLOWI_FLAG_SKIP_NH_OIF).

Fixes: 42a7b32b73d6 ("xfrm: Add oif to dst lookups")

Signed-off-by: David Ahern
Acked-by: Steffen Klassert
Signed-off-by: David S. Miller

David Ahern
2015-09-18 07:36:34 +0800

16 Sep, 2015

1 commit

b7503e0cd net: Add FIB table id to rtable

Add the FIB table id to rtable to make the information available for
IPv4 as it is for IPv6.

Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2015-09-16 03:01:41 +0800

02 Sep, 2015

1 commit

9b8ff5182 net: Make table id type u32

A number of VRF patches used 'int' for table id. It should be u32 to be
consistent with the rest of the stack.

Fixes:
4e3c89920cd3a ("net: Introduce VRF related flags and helpers")
15be405eb2ea9 ("net: Add inet_addr lookup by table")
30bbaa1950055 ("net: Fix up inet_addr_type checks")
021dd3b8a142d ("net: Add routes to the table associated with the device")
dc028da54ed35 ("inet: Move VRF table lookup to inlined function")
f6d3c19274c74 ("net: FIB tracepoints")

Signed-off-by: David Ahern
Reviewed-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller

David Ahern
2015-09-02 05:32:44 +0800

21 Aug, 2015

1 commit

61adedf3e route: move lwtunnel state to dst_entry

Currently, the lwtunnel state resides in per-protocol data. This is
a problem if we encapsulate ipv6 traffic in an ipv4 tunnel (or vice versa).
The xmit function of the tunnel does not know whether the packet has been
routed to it by ipv4 or ipv6, yet it needs the lwtstate data. Moving the
lwtstate data to dst_entry makes such inter-protocol tunneling possible.

As a bonus, this brings a nice diffstat.

Signed-off-by: Jiri Benc
Acked-by: Roopa Prabhu
Acked-by: Thomas Graf
Signed-off-by: David S. Miller

Jiri Benc
2015-08-21 06:42:36 +0800

14 Aug, 2015

3 commits

30bbaa195 net: Fix up inet_addr_type checks

Currently inet_addr_type and inet_dev_addr_type expect local addresses
to be in the local table. With the VRF device local routes for devices
associated with a VRF will be in the table associated with the VRF.
Provide an alternate inet_addr lookup to use a specific table rather
than defaulting to the local table.

inet_addr_type_dev_table keeps the same semantics as inet_addr_type but
if the passed in device is enslaved to a VRF then the table for that VRF
is used for the lookup.

Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2015-08-14 13:43:21 +0800

15be405eb net: Add inet_addr lookup by table

Currently inet_addr_type and inet_dev_addr_type expect local addresses
to be in the local table. With the VRF device local routes for devices
associated with a VRF will be in the table associated with the VRF.
Provide an alternate inet_addr lookup to use a specific table rather
than defaulting to the local table.

Signed-off-by: Shrijeet Mukherjee
Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2015-08-14 13:43:21 +0800

613d09b30 net: Use VRF device index for lookups on TX

As with ingress use the index of VRF master device for route lookups on
egress. However, the oif should only be used to direct the lookups to a
specific table. Routes in the table are not based on the VRF device but
rather interfaces that are part of the VRF so do not consider the oif for
lookups within the table. The FLOWI_FLAG_VRFSRC is used to control this
latter part.

Signed-off-by: Shrijeet Mukherjee
Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2015-08-14 13:43:20 +0800

22 Jul, 2015

1 commit

571e72267 ipv4: support for fib route lwtunnel encap attributes

This patch adds support in ipv4 fib functions to parse user
provided encap attributes and attach encap state data to fib_nh
and rtable.

Signed-off-by: Roopa Prabhu
Signed-off-by: David S. Miller

Roopa Prabhu
2015-07-22 01:39:03 +0800

16 Jan, 2015

1 commit

5055c371b ipv4: per cpu uncached list

RAW sockets with hdrinc suffer from contention on rt_uncached_lock
spinlock.

One solution is to use percpu lists, since most routes are destroyed
by the cpu that created them.

It is unclear why we even have to put these routes in uncached_list,
as all outgoing packets should be freed when a device is dismantled.

Signed-off-by: Eric Dumazet
Fixes: caacf05e5ad1 ("ipv4: Properly purge netdev references on uncached routes.")
Signed-off-by: David S. Miller

Eric Dumazet
2015-01-16 07:26:16 +0800

25 Mar, 2014

1 commit

0b8c7f6f2 ipv4: remove ip_rt_dump from route.c

ip_rt_dump does nothing after the IPv4 route cache removal, so we can remove it.

Signed-off-by: Li RongQing
Signed-off-by: David S. Miller

Li RongQing
2014-03-25 00:45:01 +0800

14 Jan, 2014

1 commit

f87c10a8a ipv4: introduce ip_dst_mtu_maybe_forward and protect forwarding path against pmtu spoofing

While forwarding we should not use the protocol path mtu to calculate
the mtu for a forwarded packet but instead use the interface mtu.

We mark forwarded skbs in ip_forward with IPSKB_FORWARDED, which was
introduced for multicast forwarding. But as it does not conflict with
our usage in unicast code path it is perfect for reuse.

I moved the functions ip_sk_accept_pmtu, ip_sk_use_pmtu and ip_skb_dst_mtu
along with the new ip_dst_mtu_maybe_forward to net/ip.h to fix circular
dependencies because of IPSKB_FORWARDED.

Because someone might have written software that probes destinations
manually and expects the kernel to honour those path mtus, I introduced
a new per-namespace "ip_forward_use_pmtu" knob so someone can disable
this new behaviour. We also still use mtus which are locked on a
route for forwarding.

The reason for this change is that path mtu information can be injected
into the kernel via e.g. the icmp_err protocol handler without verification
of local sockets. As such, this could cause the IPv4 forwarding path to
wrongfully emit fragmentation needed notifications or start to fragment
packets along a path.

Tunnel and ipsec output paths clear IPCB again, thus IPSKB_FORWARDED
won't be set and further fragmentation logic will use the path mtu to
determine the fragmentation size. They also recheck packet size with
help of path mtu discovery and report appropriate errors.

Cc: Eric Dumazet
Cc: David Miller
Cc: John Heffner
Cc: Steffen Klassert
Signed-off-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller

Hannes Frederic Sowa
2014-01-14 03:22:54 +0800
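
The MTU selection rule described in the entry above can be sketched as a small userspace model. Everything here is an assumption made for illustration: `struct dst_model`, `mtu_maybe_forward_model`, the `fwd_use_pmtu` parameter (standing in for the per-namespace knob) and the `IP_MAX_MTU_MODEL` cap are hypothetical names, not the kernel's code.

```c
#include <stdbool.h>

/* Assumed upper bound on an IPv4 MTU, used only by this model. */
#define IP_MAX_MTU_MODEL 0xFFFFU

/* Hypothetical, simplified view of the route state that matters here. */
struct dst_model {
    unsigned int path_mtu;   /* learned path MTU (may have been injected) */
    unsigned int dev_mtu;    /* MTU of the outgoing interface */
    bool mtu_locked;         /* MTU metric locked on the route */
};

/* Pick the MTU for a packet: forwarded packets ignore the learned path
 * MTU unless the knob is set or the MTU is locked on the route. */
static unsigned int mtu_maybe_forward_model(const struct dst_model *dst,
                                            bool forwarding,
                                            bool fwd_use_pmtu)
{
    if (fwd_use_pmtu || dst->mtu_locked || !forwarding)
        return dst->path_mtu;
    return dst->dev_mtu < IP_MAX_MTU_MODEL ? dst->dev_mtu : IP_MAX_MTU_MODEL;
}
```

In this model a spoofed path MTU only affects locally generated traffic; a forwarded packet keeps using the interface MTU unless the administrator opts back in.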

06 Dec, 2013

1 commit

0e0d44ab4 net: Remove FLOWI_FLAG_CAN_SLEEP

FLOWI_FLAG_CAN_SLEEP was used to notify xfrm about the possibility
to sleep until the needed states are resolved. This code is gone,
so FLOWI_FLAG_CAN_SLEEP is not needed anymore.

Signed-off-by: Steffen Klassert

Steffen Klassert
2013-12-06 14:24:39 +0800

06 Nov, 2013

1 commit

482fc6094 ipv4: introduce new IP_MTU_DISCOVER mode IP_PMTUDISC_INTERFACE

Sockets marked with IP_PMTUDISC_INTERFACE won't do path mtu discovery,
their sockets won't accept and install new path mtu information and they
will always use the interface mtu for outgoing packets. It is guaranteed
that the packet is not fragmented locally. But we won't set the DF-Flag
on the outgoing frames.

Florian Weimer had the idea to use this flag to ensure DNS servers are
never generating outgoing fragments. They may well be fragmented on the
path, but the server never stores or uses path mtu values, which could
well be forged in an attack.

(The root of the problem with path MTU discovery is that there is
no reliable way to authenticate ICMP Fragmentation Needed But DF Set
messages because they are sent from intermediate routers with their
source addresses, and the ICMP payload will not always contain sufficient
information to identify a flow.)

Recent research in the DNS community showed that it is possible to
implement an attack where DNS cache poisoning is feasible by spoofing
fragments. This work was done by Amir Herzberg and Haya Shulman:

This issue was previously discussed among the DNS community, e.g.
,
without leading to fixes.

This patch depends on the patch "ipv4: fix DO and PROBE pmtu mode
regarding local fragmentation with UFO/CORK" for the enforcement of the
non-fragmentable checks. If other users than ip_append_page/data should
use this semantic too, we have to add a new flag to IPCB(skb)->flags to
suppress local fragmentation and check for this in ip_finish_output.

Many thanks to Florian Weimer for the idea and feedback while implementing
this patch.

Cc: David S. Miller
Suggested-by: Florian Weimer
Signed-off-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller

Hannes Frederic Sowa
2013-11-06 10:52:27 +0800
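
From userspace, the mode introduced in the entry above would be selected through the existing IP_MTU_DISCOVER socket option. The following is a minimal Linux-only sketch; the helper name `use_interface_mtu` is hypothetical, and the fallback define assumes the value 4 used by the kernel UAPI header for libc headers that predate the option.

```c
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef IP_PMTUDISC_INTERFACE
#define IP_PMTUDISC_INTERFACE 4   /* assumed value, for older libc headers */
#endif

/* Ask the kernel to always use the interface MTU on this socket and to
 * ignore learned path MTU information. Returns 0 on success, -1 on error. */
static int use_interface_mtu(int fd)
{
    int mode = IP_PMTUDISC_INTERFACE;

    return setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &mode, sizeof(mode));
}
```

A DNS server, the use case named in the commit message, would call this once on its UDP socket right after creating it.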

18 Oct, 2013

1 commit

0baf2b35f ipv4: shrink rt_cache_stat

Half of the rt_cache_stat fields are no longer used after IP
route cache removal, let's shrink this per cpu area.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2013-10-18 04:11:04 +0800

29 Sep, 2013

1 commit

aa6615814 ipv4: processing ancillary IP_TOS or IP_TTL

If IP_TOS or IP_TTL are specified as ancillary data, then sendmsg() sends out
packets with the specified TTL or TOS overriding the socket values specified
with the traditional setsockopt().

The struct inet_cork stores the values of TOS, TTL and priority that are
passed through the struct ipcm_cookie. If there are user-specified TOS
(tos != -1) or TTL (ttl != 0) in the struct ipcm_cookie, these values are
used to override the per-socket values. In case of TOS also the priority
is changed accordingly.

Two helper functions get_rttos and get_rtconn_flags are defined to take
into account the presence of a user specified TOS value when computing
RT_TOS and RT_CONN_FLAGS.

Signed-off-by: Francesco Fusco
Signed-off-by: David S. Miller

Francesco Fusco
2013-09-29 06:21:52 +0800
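
A sender would use the feature from the entry above by attaching control messages to sendmsg(). The sketch below only builds the ancillary data; `set_tos_ttl` is a hypothetical helper name, and passing both values as `int` payloads is an assumption about what the kernel accepts.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Fill msg's control area with one IP_TOS and one IP_TTL message.
 * buf must be suitably aligned for struct cmsghdr and hold at least
 * 2 * CMSG_SPACE(sizeof(int)) bytes. Returns 0 on success, -1 on error. */
static int set_tos_ttl(struct msghdr *msg, void *buf, size_t buflen,
                       int tos, int ttl)
{
    const size_t need = 2 * CMSG_SPACE(sizeof(int));
    struct cmsghdr *cm;

    if (buflen < need)
        return -1;
    memset(buf, 0, need);
    msg->msg_control = buf;
    msg->msg_controllen = need;

    /* First message: per-packet TOS. */
    cm = CMSG_FIRSTHDR(msg);
    cm->cmsg_level = IPPROTO_IP;
    cm->cmsg_type = IP_TOS;
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &tos, sizeof(int));

    /* Second message: per-packet TTL. */
    cm = CMSG_NXTHDR(msg, cm);
    if (cm == NULL)
        return -1;
    cm->cmsg_level = IPPROTO_IP;
    cm->cmsg_type = IP_TTL;
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &ttl, sizeof(int));
    return 0;
}
```

The caller still fills in `msg_name` and `msg_iov` as usual; values set this way apply to that one sendmsg() call and leave the socket-wide setsockopt() values untouched.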

23 Sep, 2013

1 commit

2702c4bb8 route.h: Remove extern from function prototypes

There is a mix of function prototypes with and without extern
in the kernel sources. Standardize on not using extern for
function prototypes.

Function prototypes don't need to be written with extern.
extern is assumed by the compiler. Its use is as unnecessary as
using auto to declare automatic/local variables in a block.

Signed-off-by: Joe Perches
Signed-off-by: David S. Miller

Joe Perches
2013-09-23 13:51:08 +0800
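
The point made in the entry above, that `extern` on a function prototype is redundant, can be checked with a tiny example. The function name `double_it` is hypothetical and chosen only for illustration.

```c
/* Two prototypes for the same function: both declare external linkage,
 * so the compiler treats them as the same declaration. */
extern int double_it(int x);   /* written with extern */
int double_it(int x);          /* written without extern */

/* A single definition satisfies both prototypes. */
int double_it(int x)
{
    return 2 * x;
}
```

Since the two prototypes are interchangeable, dropping `extern` from a header changes nothing for callers.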

14 Aug, 2013

1 commit

0ea9d5e3e xfrm: introduce helper for safe determination of mtu

skb->sk socket can be of AF_INET or AF_INET6 address family. Thus we
always have to make sure we are referring to the correct interpretation
of skb->sk.

We only depend on header defines to query the mtu, so we don't introduce
a new dependency to ipv6 by this change.

Cc: Steffen Klassert
Signed-off-by: Hannes Frederic Sowa
Signed-off-by: Steffen Klassert

Hannes Frederic Sowa
2013-08-14 19:09:07 +0800

04 Nov, 2012

1 commit

6da025fa2 ipv4: avoid a test in ip_rt_put()

We can save a test in ip_rt_put(), considering dst_release() accepts
a NULL parameter, and dst is first element in rtable.

Add a BUILD_BUG_ON() to catch any change that could break this
assertion.

Signed-off-by: Eric Dumazet
Cc: Cong Wang
Acked-by: Cong Wang
Signed-off-by: David S. Miller

Eric Dumazet
2012-11-04 02:59:04 +0800
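
The reasoning in the entry above relies on the dst being the first member of the route structure, so a NULL route pointer is also a NULL dst pointer. A userspace sketch of that idea follows; the `_model` names are hypothetical, and `_Static_assert` stands in for the kernel's BUILD_BUG_ON().

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for dst_entry and rtable. */
struct dst_entry_model {
    int refcnt;
};

struct rtable_put_model {
    struct dst_entry_model dst;   /* must stay the first member */
    int rt_flags;
};

/* Compile-time guard: the build breaks if dst is ever moved. */
_Static_assert(offsetof(struct rtable_put_model, dst) == 0,
               "dst must be the first member");

/* Like the real release helper, this accepts NULL and does nothing. */
static void dst_release_model(struct dst_entry_model *dst)
{
    if (dst != NULL)
        dst->refcnt--;
}

/* No NULL test needed here: a NULL rt converts to a NULL dst pointer
 * because dst sits at offset 0. */
static void ip_rt_put_model(struct rtable_put_model *rt)
{
    dst_release_model((struct dst_entry_model *)rt);
}
```

The saved branch matters only because the function is called on hot paths; the compile-time assertion is what keeps the shortcut safe against later layout changes.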

09 Oct, 2012

1 commit

155e8336c ipv4: introduce rt_uses_gateway

Add new flag to remember when route is via gateway.
We will use it to allow rt_gateway to contain address of
directly connected host for the cases when DST_NOCACHE is
used or when the NH exception caches per-destination route
without DST_NOCACHE flag, i.e. when routes are not used for
other destinations. This way we force the neighbour
resolving to work with the routed destination but we
can use different address in the packet, feature needed
for IPVS-DR where original packet for virtual IP is routed
via route to real IP.

Signed-off-by: Julian Anastasov
Signed-off-by: David S. Miller

Julian Anastasov
2012-10-09 05:42:36 +0800

19 Sep, 2012

1 commit

bafa6d9d8 ipv4/route: arg delay is useless in rt_cache_flush()

Since route cache deletion (89aef8921bfbac22f), delay is no
longer used. Remove it.

Signed-off-by: Nicolas Dichtel
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Nicolas Dichtel
2012-09-19 03:44:34 +0800

01 Aug, 2012

1 commit

caacf05e5 ipv4: Properly purge netdev references on uncached routes.

When a device is unregistered, we have to purge all of the
references to it that may exist in the entire system.

If a route is uncached, we currently have no way of accomplishing
this.

So create a global list that is scanned when a network device goes
down. This mirrors the logic in net/core/dst.c's dst_ifdown().

Signed-off-by: David S. Miller

David S. Miller
2012-08-01 06:06:50 +0800

27 Jul, 2012

1 commit

c6cffba4f ipv4: Fix input route performance regression.

With the routing cache removal we lost the "noref" code paths on
input, and this can kill some routing workloads.

Reinstate the noref path when we hit a cached route in the FIB
nexthops.

With help from Eric Dumazet.

Reported-by: Alexander Duyck
Signed-off-by: David S. Miller
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

David S. Miller
2012-07-27 06:50:39 +0800

24 Jul, 2012

1 commit

13378cad0 ipv4: Change rt->rt_iif encoding.

On input packet processing, rt->rt_iif will be zero if we should
use skb->dev->ifindex.

Since we access rt->rt_iif consistently via inet_iif(), that is
the only spot whose interpretation has to be adjusted.

Signed-off-by: David S. Miller

David S. Miller
2012-07-24 07:36:27 +0800

21 Jul, 2012

4 commits

2860583fe ipv4: Kill rt->fi

It's not really needed.

We only grabbed a reference to the fib_info for the sake of fib_info
local metrics.

However, fib_info objects are freed using RCU, as are therefore their
private metrics (if any).

We would have triggered a route cache flush if we eliminated a
reference to a fib_info object in the routing tables.

Therefore, any existing cached routes will first check and see that
they have been invalidated before an errant reference to these
metric values would occur.

Signed-off-by: David S. Miller

David S. Miller
2012-07-21 04:40:07 +0800

9917e1e87 ipv4: Turn rt->rt_route_iif into rt->rt_is_input.

That is this value's only use, as a boolean to indicate whether
a route is an input route or not.

So implement it that way, using a u16 gap present in the struct
already.

Signed-off-by: David S. Miller

David S. Miller
2012-07-21 04:40:02 +0800

4fd551d7b ipv4: Kill rt->rt_oif

Never actually used.

It was being set on output routes to the original OIF specified in the
flow key used for the lookup.

Adjust the only user, ipmr_rt_fib_lookup(), for greater correctness of
the flowi4_oif and flowi4_iif values, thanks to feedback from Julian
Anastasov.

Signed-off-by: David S. Miller

David S. Miller
2012-07-21 04:38:34 +0800

f8126f1d5 ipv4: Adjust semantics of rt->rt_gateway.

In order to allow prefixed routes, we have to adjust how rt_gateway
is set and interpreted.

The new interpretation is:

1) rt_gateway == 0, destination is on-link, nexthop is iph->daddr

2) rt_gateway != 0, destination requires a nexthop gateway

Abstract the fetching of the proper nexthop value using a new
inline helper, rt_nexthop(), as suggested by Joe Perches.

Signed-off-by: David S. Miller
Tested-by: Vijay Subramanian

David S. Miller
2012-07-21 04:31:20 +0800
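
The two-case interpretation of rt_gateway listed in the entry above maps directly onto a small helper. This is a userspace sketch with hypothetical `_model` names; addresses are plain `uint32_t` values here rather than the kernel's big-endian type.

```c
#include <stdint.h>

/* Hypothetical, simplified route: only the gateway field is modelled. */
struct rtable_gw_model {
    uint32_t rt_gateway;   /* 0 means the destination is on-link */
};

/* Return the address to resolve as the next hop:
 *   rt_gateway == 0 -> the packet's own destination address
 *   rt_gateway != 0 -> the configured gateway */
static uint32_t rt_nexthop_model(const struct rtable_gw_model *rt,
                                 uint32_t daddr)
{
    if (rt->rt_gateway != 0)
        return rt->rt_gateway;
    return daddr;
}
```

Callers that previously read the gateway field directly would go through the helper instead, so the "zero means on-link" convention lives in one place.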