Eric Lee / smarc-fsl-linux-kernel

19 Nov, 2019

1 commit

ca749bbb1 net/ipv4: fix sysctl max for fib_multipath_hash_policy

Commit eec4844fae7c ("proc/sysctl: add shared variables for range
check") did:
- .extra2 = &two,
+ .extra2 = SYSCTL_ONE,
here, which doesn't seem to be intentional, given the changelog.
This patch restores it to the previous, as the value of 2 still makes
sense (used in fib_multipath_hash()).

Fixes: eec4844fae7c ("proc/sysctl: add shared variables for range check")
Cc: Matteo Croce
Signed-off-by: Marcelo Ricardo Leitner
Acked-by: Matteo Croce
Signed-off-by: David S. Miller

Marcelo Ricardo Leitner
2019-11-19 09:25:36 +0800

10 Aug, 2019

1 commit

c04b79b6c tcp: add new tcp_mtu_probe_floor sysctl

The current implementation of TCP MTU probing can considerably
underestimate the MTU on lossy connections allowing the MSS to get down to
48. We have found that in almost all of these cases on our networks these
paths can handle much larger MTUs meaning the connections are being
artificially limited. Even though TCP MTU probing can raise the MSS back up
we have seen this not to be the case causing connections to be "stuck" with
an MSS of 48 when heavy loss is present.

Prior to pushing out this change we could not keep TCP MTU probing enabled
because of the above reasons. Now with a reasonable floor set we've had it
enabled for the past 6 months.

The new sysctl will still default to TCP_MIN_SND_MSS (48), but gives
administrators the ability to control the floor of MSS probing.
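The effect of a floor can be illustrated with a small user-space sketch. It assumes, as a simplification, that every failed probe halves the MSS estimate; the real kernel logic is more involved. `next_probe_mss` and `mss_after_losses` are hypothetical helper names, and only `TCP_MIN_SND_MSS` (48) comes from the commit message.

```c
#define TCP_MIN_SND_MSS 48 /* default floor named in the commit message */

/* Hypothetical helper: halve the MSS estimate, but never go below floor.
 * The halving rule is an assumption made for illustration. */
int next_probe_mss(int cur_mss, int floor)
{
    int mss = cur_mss / 2;

    return mss < floor ? floor : mss;
}

/* Apply `rounds` consecutive reductions, as on a persistently lossy path. */
int mss_after_losses(int start_mss, int floor, int rounds)
{
    int mss = start_mss;

    for (int i = 0; i < rounds; i++)
        mss = next_probe_mss(mss, floor);
    return mss;
}
```

With the default floor a long run of losses drives the estimate down to 48, whereas a higher floor (512 is just an example value) stops the descent there.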

Signed-off-by: Josh Hunt
Signed-off-by: Eric Dumazet
Acked-by: Neal Cardwell
Signed-off-by: David S. Miller

Josh Hunt
2019-08-10 04:03:30 +0800

19 Jul, 2019

1 commit

eec4844fa proc/sysctl: add shared variables for range check

In the sysctl code the proc_dointvec_minmax() function is often used to
validate that the user-supplied value lies within an allowed range. This
function uses the extra1 and extra2 members of struct ctl_table as the
minimum and maximum allowed values.

Where sysctl handlers are declared, every source file has some
read-only variables containing just an integer whose address is assigned
to the extra1 and extra2 members, so that the sysctl range is enforced.

The special values 0, 1 and INT_MAX are very often used as range
boundaries, leading to duplication of variables like zero=0, one=1,
int_max=INT_MAX in different source files:

$ git grep -E '\.extra[12].*&(zero|one|int_max)' |wc -l
248

Add a const int array containing the most commonly used values, some
macros to refer more easily to the correct array member, and use them
instead of creating a local one for every object file.
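The idea can be sketched in user space as follows. `sysctl_vals` and `SYSCTL_ONE` are names that appear in this log; the other macro names, `struct range_entry` and `in_range` are illustrative stand-ins rather than the kernel's definitions, and the range check is only a rough model of what a min/max handler does.

```c
#include <limits.h>
#include <stddef.h>

/* One shared table instead of a private zero/one/int_max in every file. */
const int sysctl_vals[] = { 0, 1, INT_MAX };

/* Macro names other than SYSCTL_ONE are assumed for this sketch. */
#define SYSCTL_ZERO    ((void *)&sysctl_vals[0])
#define SYSCTL_ONE     ((void *)&sysctl_vals[1])
#define SYSCTL_INT_MAX ((void *)&sysctl_vals[2])

/* Cut-down stand-in for struct ctl_table: only the range bounds. */
struct range_entry {
    void *extra1; /* pointer to minimum, or NULL for no lower bound */
    void *extra2; /* pointer to maximum, or NULL for no upper bound */
};

/* Returns 1 if val lies within [*extra1, *extra2], 0 otherwise. */
int in_range(const struct range_entry *e, int val)
{
    if (e->extra1 && val < *(const int *)e->extra1)
        return 0;
    if (e->extra2 && val > *(const int *)e->extra2)
        return 0;
    return 1;
}
```

An entry bounded by `SYSCTL_ZERO` and `SYSCTL_ONE` accepts only 0 and 1, so a knob that legitimately takes the value 2 needs its own upper bound.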

This is the bloat-o-meter output comparing the old and new binary
compiled with the default Fedora config:

# scripts/bloat-o-meter -d vmlinux.o.old vmlinux.o
add/remove: 2/2 grow/shrink: 0/2 up/down: 24/-188 (-164)
Data old new delta
sysctl_vals - 12 +12
__kstrtab_sysctl_vals - 12 +12
max 14 10 -4
int_max 16 - -16
one 68 - -68
zero 128 28 -100
Total: Before=20583249, After=20583085, chg -0.00%

[mcroce@redhat.com: tipc: remove two unused variables]
Link: http://lkml.kernel.org/r/20190530091952.4108-1-mcroce@redhat.com
[akpm@linux-foundation.org: fix net/ipv6/sysctl_net_ipv6.c]
[arnd@arndb.de: proc/sysctl: make firmware loader table conditional]
Link: http://lkml.kernel.org/r/20190617130014.1713870-1-arnd@arndb.de
[akpm@linux-foundation.org: fix fs/eventpoll.c]
Link: http://lkml.kernel.org/r/20190430180111.10688-1-mcroce@redhat.com
Signed-off-by: Matteo Croce
Signed-off-by: Arnd Bergmann
Acked-by: Kees Cook
Reviewed-by: Aaron Tomlin
Cc: Matthew Wilcox
Cc: Stephen Rothwell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matteo Croce
2019-07-19 08:08:07 +0800

23 Jun, 2019

1 commit

438ac8800 net: fastopen: robustness and endianness fixes for SipHash

Some changes to the TCP fastopen code to make it more robust
against future changes in the choice of key/cookie size, etc.

- Instead of keeping the SipHash key in an untyped u8[] buffer
and casting it to the right type upon use, use the correct
type directly. This ensures that the key will appear at the
correct alignment if we ever change the way these data
structures are allocated. (Currently, they are only allocated
via kmalloc so they always appear at the correct alignment)

- Use DIV_ROUND_UP when sizing the u64[] array to hold the
cookie, so it is always of sufficient size, even if
TCP_FASTOPEN_COOKIE_MAX is no longer a multiple of 8.

- Drop the 'len' parameter from the tcp_fastopen_reset_cipher()
function, which is no longer used.

- Add endian swabbing when setting the keys and calculating the hash,
to ensure that cookie values are the same for a given key and
source/destination address pair regardless of the endianness of
the server.

Note that none of these are functional changes wrt the current
state of the code, with the exception of the swabbing, which only
affects big endian systems.
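The DIV_ROUND_UP sizing point can be shown with a short sketch. The macro below is the usual round-up division idiom; `cookie_words` is a hypothetical helper, and the byte counts in the usage note are example values only.

```c
#include <stddef.h>
#include <stdint.h>

/* Round-up integer division: smallest q such that q * d >= n. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper: number of 64-bit words needed to hold a cookie
 * of cookie_len bytes. */
size_t cookie_words(size_t cookie_len)
{
    return DIV_ROUND_UP(cookie_len, sizeof(uint64_t));
}
```

For a 16-byte cookie both plain division and DIV_ROUND_UP give 2 words. For a length that is not a multiple of 8, such as 12, plain division would yield 1 word (too small), while DIV_ROUND_UP yields 2.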

Signed-off-by: Ard Biesheuvel
Signed-off-by: David S. Miller

Ard Biesheuvel
2019-06-23 07:30:37 +0800

18 Jun, 2019

2 commits

13091aa30 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Honestly all the conflicts were simple overlapping changes,
nothing really interesting to report.

Signed-off-by: David S. Miller

David S. Miller
2019-06-18 11:20:36 +0800
4fddbf8a9 Merge branch 'tcp-fixes'

Eric Dumazet says:

====================
tcp: make sack processing more robust

Jonathan Looney brought to our attention multiple problems
in TCP stack at the sender side.

SACK processing can be abused by malicious peers to either
cause overflows, or increase of memory usage.

First two patches fix the immediate problems.

Since the malicious peers abuse senders by advertising a very
small MSS in their SYN or SYNACK packet, the last two
patches add a new sysctl so that admins can choose a higher
limit for MSS clamping.
====================

Signed-off-by: David S. Miller

David S. Miller
2019-06-18 01:39:56 +0800

17 Jun, 2019

1 commit

2e05fcae8 tcp: fix compile error if !CONFIG_SYSCTL

tcp_tx_skb_cache_key and tcp_rx_skb_cache_key must be available
even if CONFIG_SYSCTL is not set.

Fixes: 0b7d7f6b2208 ("tcp: add tcp_tx_skb_cache sysctl")
Fixes: ede61ca474a0 ("tcp: add tcp_rx_skb_cache sysctl")
Signed-off-by: Eric Dumazet
Reported-by: Willem de Bruijn
Signed-off-by: David S. Miller

Eric Dumazet
2019-06-17 05:15:07 +0800

16 Jun, 2019

1 commit

5f3e2bf00 tcp: add tcp_min_snd_mss sysctl

Some TCP peers announce a very small MSS option in their SYN and/or
SYN/ACK messages.

This forces the stack to send packets with a very high network/cpu
overhead.

Linux has enforced a minimal value of 48. Since this value includes
the size of TCP options, and that the options can consume up to 40
bytes, this means that each segment can include only 8 bytes of payload.

In some cases, it can be useful to increase the minimal value
to a saner value.

We still leave the default at 48 (TCP_MIN_SND_MSS), for compatibility
reasons.
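The overhead arithmetic above can be made concrete with a small sketch. Both function names are hypothetical; the figures 48 and 40 come from the commit message, and the 1000-byte transfer is just an example.

```c
/* Payload bytes left in one segment once TCP options are accounted for. */
int payload_per_segment(int mss, int options_len)
{
    return mss - options_len;
}

/* Segments needed to carry total_bytes, rounding up.
 * Returns -1 if the options leave no room for payload. */
int segments_needed(int total_bytes, int mss, int options_len)
{
    int payload = payload_per_segment(mss, options_len);

    if (payload <= 0)
        return -1;
    return (total_bytes + payload - 1) / payload;
}
```

With an MSS of 48 and 40 bytes of options, sending 1000 bytes takes 125 segments of 8 payload bytes each. With an MSS of 536 and the same options it takes 3.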

Note that TCP_MAXSEG socket option enforces a minimal value
of (TCP_MIN_MSS). David Miller increased this minimal value
in commit c39508d6f118 ("tcp: Make TCP_MAXSEG minimum more correct.")
from 64 to 88.

We might in the future merge TCP_MIN_SND_MSS and TCP_MIN_MSS.

CVE-2019-11479 -- tcp mss hardcoded to 48

Signed-off-by: Eric Dumazet
Suggested-by: Jonathan Looney
Acked-by: Neal Cardwell
Cc: Yuchung Cheng
Cc: Tyler Hicks
Cc: Bruce Curtis
Cc: Jonathan Lemon
Signed-off-by: David S. Miller

Eric Dumazet
2019-06-16 09:47:31 +0800

15 Jun, 2019

3 commits

0b7d7f6b2 tcp: add tcp_tx_skb_cache sysctl

Feng Tang reported a performance regression after introduction
of per TCP socket tx/rx caches, for TCP over loopback (netperf)

There is a high chance the regression is caused by a change in
how well the 32 KB per-thread page (current->task_frag) can
be recycled, and by the lack of pcp caches for order-3 pages.

I could not reproduce the regression myself: all CPUs were
spinning on the mm spinlocks for page allocs/freeing, regardless
of enabling or disabling the per tcp socket caches.

It seems best to disable the feature by default, and let
admins enable it.

MM layer either needs to provide scalable order-3 pages
allocations, or could attempt a trylock on zone->lock if
the caller only attempts to get a high-order page and is
able to fallback to order-0 ones in case of pressure.

Tests run on a 56 cores host (112 hyper threads)

- 35.49% netperf [kernel.vmlinux] [k] queued_spin_lock_slowpath
- 35.49% queued_spin_lock_slowpath
- 18.18% get_page_from_freelist
- __alloc_pages_nodemask
- 18.18% alloc_pages_current
skb_page_frag_refill
sk_page_frag_refill
tcp_sendmsg_locked
tcp_sendmsg
inet_sendmsg
sock_sendmsg
__sys_sendto
__x64_sys_sendto
do_syscall_64
entry_SYSCALL_64_after_hwframe
__libc_send
+ 17.31% __free_pages_ok
+ 31.43% swapper [kernel.vmlinux] [k] intel_idle
+ 9.12% netperf [kernel.vmlinux] [k] copy_user_enhanced_fast_string
+ 6.53% netserver [kernel.vmlinux] [k] copy_user_enhanced_fast_string
+ 0.69% netserver [kernel.vmlinux] [k] queued_spin_lock_slowpath
+ 0.68% netperf [kernel.vmlinux] [k] skb_release_data
+ 0.52% netperf [kernel.vmlinux] [k] tcp_sendmsg_locked
0.46% netperf [kernel.vmlinux] [k] _raw_spin_lock_irqsave

Fixes: 472c2e07eef0 ("tcp: add one skb cache for tx")
Signed-off-by: Eric Dumazet
Reported-by: Feng Tang
Signed-off-by: David S. Miller

Eric Dumazet
2019-06-15 11:18:28 +0800
ede61ca47 tcp: add tcp_rx_skb_cache sysctl

Instead of relying on rps_needed, it is safer to use a separate
static key, since we do not want to enable TCP rx_skb_cache
by default. This feature can cause a huge increase in memory
usage on hosts with millions of sockets.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2019-06-15 11:18:28 +0800
363887a2c ipv4: Support multipath hashing on inner IP pkts for GRE tunnel

A multipath hash policy value of 0 doesn't distribute tunneled traffic, since
the outer IP dest and src aren't varied even though the inner ones are. Since the flow
is on the inner ones in the case of tunneled traffic, hashing on them is
desired.

This is done mainly for IP over GRE, hence only tested for that. But
anything else supported by flow dissection should work.

v2: Use skb_flow_dissect_flow_keys() directly so that other tunneling
can be supported through flow dissection (per Nikolay Aleksandrov).
v3: Remove accidental inclusion of ports in the hash keys and clarify
the documentation (Nikolay Aleksandrov).
Signed-off-by: Stephen Suryaputra
Signed-off-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller

Stephen Suryaputra
2019-06-15 10:42:35 +0800

31 May, 2019

2 commits

aa1236cdf tcp: add support for optional TFO backup key to net.ipv4.tcp_fastopen_key

Add the ability to add a backup TFO key as:

# echo "x-x-x-x,x-x-x-x" > /proc/sys/net/ipv4/tcp_fastopen_key

The key before the comma acts as the primary TFO key and the key after the
comma is the backup TFO key. This change is intended to be backwards
compatible since if only one key is set, userspace will simply read back
that single key as follows:

# echo "x-x-x-x" > /proc/sys/net/ipv4/tcp_fastopen_key
# cat /proc/sys/net/ipv4/tcp_fastopen_key
x-x-x-x
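A parser for this key syntax might look like the sketch below. It assumes each `x` is a hexadecimal 32-bit word; `parse_tfo_keys` is a hypothetical user-space function, not the kernel's handler.

```c
#include <stdint.h>
#include <stdio.h>

/* Parses "a-b-c-d" or "a-b-c-d,e-f-g-h" where each field is a hex word.
 * Fills primary[] (and backup[] when a second key is present).
 * Returns the number of keys parsed: 1 or 2, or 0 on malformed input. */
int parse_tfo_keys(const char *s, uint32_t primary[4], uint32_t backup[4])
{
    unsigned int w[8];
    int n = sscanf(s, "%x-%x-%x-%x,%x-%x-%x-%x",
                   &w[0], &w[1], &w[2], &w[3],
                   &w[4], &w[5], &w[6], &w[7]);

    /* Only a complete primary key, or primary plus backup, is accepted. */
    if (n != 4 && n != 8)
        return 0;
    for (int i = 0; i < 4; i++)
        primary[i] = w[i];
    if (n == 8)
        for (int i = 0; i < 4; i++)
            backup[i] = w[4 + i];
    return n / 4;
}
```

A string with a single key yields 1 and leaves `backup` untouched, which mirrors the backwards-compatible behaviour described above.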

Signed-off-by: Jason Baron
Signed-off-by: Christoph Paasch
Acked-by: Yuchung Cheng
Signed-off-by: David S. Miller

Jason Baron
2019-05-31 04:41:26 +0800
9092a76d3 tcp: add backup TFO key infrastructure

We would like to be able to rotate TFO keys while minimizing the number of
client cookies that are rejected. Currently, we have only one key which can
be used to generate and validate cookies, thus if we simply replace this
key clients can easily have cookies rejected upon rotation.

We propose having the ability to have both a primary key and a backup key.
The primary key is used to generate as well as to validate cookies.
The backup is only used to validate cookies. Thus, keys can be rotated as:

1) generate new key
2) add new key as the backup key
3) swap the primary and backup key, thus setting the new key as the primary

We don't simply set the new key as the primary key and move the old key to
the backup slot because the ip may be behind a load balancer and we further
allow for the fact that all machines behind the load balancer will not be
updated simultaneously.
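The rotation scheme can be sketched as follows. The cookie function here is a toy keyed mix chosen only so the example is self-contained; it is NOT SipHash and not the kernel's algorithm. All names (`struct tfo_keys`, `toy_cookie`, `tfo_generate`, `tfo_validate`, `tfo_set_backup`, `tfo_swap`) are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

struct tfo_keys {
    uint64_t primary;  /* generates and validates cookies */
    uint64_t backup;   /* validates cookies only */
    bool has_backup;
};

/* Toy keyed mix standing in for the real cookie function (assumption). */
uint64_t toy_cookie(uint64_t key, uint32_t client_ip)
{
    uint64_t x = key ^ ((uint64_t)client_ip * 0x9e3779b97f4a7c15ULL);

    x ^= x >> 33;
    x *= 0xff51afd7ed558ccdULL;
    x ^= x >> 33;
    return x;
}

/* New cookies are always generated with the primary key. */
uint64_t tfo_generate(const struct tfo_keys *k, uint32_t client_ip)
{
    return toy_cookie(k->primary, client_ip);
}

/* A cookie is valid if either the primary or the backup key produced it. */
bool tfo_validate(const struct tfo_keys *k, uint32_t client_ip, uint64_t cookie)
{
    if (cookie == toy_cookie(k->primary, client_ip))
        return true;
    return k->has_backup && cookie == toy_cookie(k->backup, client_ip);
}

/* Step 2 of the rotation: install the new key as backup. */
void tfo_set_backup(struct tfo_keys *k, uint64_t new_key)
{
    k->backup = new_key;
    k->has_backup = true;
}

/* Step 3 of the rotation: swap primary and backup. */
void tfo_swap(struct tfo_keys *k)
{
    if (k->has_backup) {
        uint64_t tmp = k->primary;

        k->primary = k->backup;
        k->backup = tmp;
    }
}
```

A cookie issued under the old primary key stays valid through both steps, because after the swap the old key still sits in the backup slot.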

We make use of this infrastructure in subsequent patches.

Suggested-by: Igor Lubashev
Signed-off-by: Jason Baron
Signed-off-by: Christoph Paasch
Acked-by: Yuchung Cheng
Signed-off-by: David S. Miller

Jason Baron
2019-05-31 04:41:26 +0800

26 Apr, 2019

1 commit

8b4483658 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Two easy cases of overlapping changes.

Signed-off-by: David S. Miller

David S. Miller
2019-04-26 11:52:29 +0800

18 Apr, 2019

1 commit

19fad20d1 ipv4: set the tcp_min_rtt_wlen range from 0 to one day

There is a UBSAN report as below:
UBSAN: Undefined behaviour in net/ipv4/tcp_input.c:2877:56
signed integer overflow:
2147483647 * 1000 cannot be represented in type 'int'
CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.1.0-rc4-00058-g582549e #1
Call Trace:

dump_stack+0x8c/0xba
ubsan_epilogue+0x11/0x60
handle_overflow+0x12d/0x170
? ttwu_do_wakeup+0x21/0x320
__ubsan_handle_mul_overflow+0x12/0x20
tcp_ack_update_rtt+0x76c/0x780
tcp_clean_rtx_queue+0x499/0x14d0
tcp_ack+0x69e/0x1240
? __wake_up_sync_key+0x2c/0x50
? update_group_capacity+0x50/0x680
tcp_rcv_established+0x4e2/0xe10
tcp_v4_do_rcv+0x22b/0x420
tcp_v4_rcv+0xfe8/0x1190
ip_protocol_deliver_rcu+0x36/0x180
ip_local_deliver+0x15b/0x1a0
ip_rcv+0xac/0xd0
__netif_receive_skb_one_core+0x7f/0xb0
__netif_receive_skb+0x33/0xc0
netif_receive_skb_internal+0x84/0x1c0
napi_gro_receive+0x2a0/0x300
receive_buf+0x3d4/0x2350
? detach_buf_split+0x159/0x390
virtnet_poll+0x198/0x840
? reweight_entity+0x243/0x4b0
net_rx_action+0x25c/0x770
__do_softirq+0x19b/0x66d
irq_exit+0x1eb/0x230
do_IRQ+0x7a/0x150
common_interrupt+0xf/0xf

It can be reproduced by:
echo 2147483647 > /proc/sys/net/ipv4/tcp_min_rtt_wlen
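The overflow in the report is a plain signed multiplication, which can be guarded against as in the sketch below. `wlen_product_fits` and `ONE_DAY_SECS` are hypothetical names; the factor 1000 is taken from the UBSAN message and one day is taken as 86400 seconds.

```c
#include <limits.h>
#include <stdbool.h>

#define ONE_DAY_SECS 86400 /* upper bound suggested by the commit title */

/* True if secs * hz can be represented in an int without overflow.
 * Negative secs and non-positive hz are rejected outright. */
bool wlen_product_fits(int secs, int hz)
{
    if (secs < 0 || hz <= 0)
        return false;
    return secs <= INT_MAX / hz;
}
```

The reproducer value 2147483647 fails this check for a factor of 1000, while one day (86400 * 1000 = 86,400,000) fits comfortably in a 32-bit int.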

Fixes: f672258391b42 ("tcp: track min RTT using windowed min-filter")
Signed-off-by: ZhangXiaoxu
Signed-off-by: David S. Miller

ZhangXiaoxu
2019-04-18 04:57:11 +0800

22 Mar, 2019

1 commit

9ab948a91 ipv4: Allow amount of dirty memory from fib resizing to be controllable

fib_trie implementation calls synchronize_rcu when a certain amount of
pages are dirty from freed entries. The number of pages was determined
experimentally in 2009 (commit c3059477fce2d).

At the current setting, synchronize_rcu is called often -- 51 times in a
second in one test with an average of an 8 msec delay adding a fib entry.
The overall effect is a significant slowdown when modifying the fib. This is seen
in the output of 'time' - the difference between real time and sys+user.
For example, using 720,022 single path routes and 'ip -batch'[1]:

$ time ./ip -batch ipv4/routes-1-hops
real 0m14.214s
user 0m2.513s
sys 0m6.783s

So roughly 35% of the actual time to install the routes is from the ip
command getting scheduled out, most notably due to synchronize_rcu (this
is observed using 'perf sched timehist').
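The "roughly 35%" figure follows directly from the `time` output. A minimal sketch of the arithmetic, with hypothetical function names:

```c
/* Wall-clock time not accounted for by user or system CPU time. */
double dead_time(double real, double user, double sys)
{
    return real - (user + sys);
}

/* Fraction of the wall-clock time spent off-CPU. */
double dead_fraction(double real, double user, double sys)
{
    return dead_time(real, user, sys) / real;
}
```

For the run above, 14.214 - (2.513 + 6.783) leaves about 4.918 seconds, or roughly 34.6% of the real time.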

This patch makes the amount of dirty memory configurable between 64k where
the synchronize_rcu is called often (small, low end systems that are memory
sensitive) to 64M where synchronize_rcu is called rarely during a large
FIB change (for high end systems with lots of memory). The default is 512kB
which corresponds to the current setting of 128 pages with a 4kB page size.
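The relationship between the byte limits and page counts can be sketched as below. The macro and function names are hypothetical and are not the kernel's identifiers; the bounds and default are the ones quoted in the text.

```c
#define SYNC_MEM_MIN     (64UL * 1024)        /* 64k lower bound */
#define SYNC_MEM_MAX     (64UL * 1024 * 1024) /* 64M upper bound */
#define SYNC_MEM_DEFAULT (512UL * 1024)       /* 512kB default */

/* Clamp a requested byte count into the configurable range. */
unsigned long clamp_sync_mem(unsigned long bytes)
{
    if (bytes < SYNC_MEM_MIN)
        return SYNC_MEM_MIN;
    if (bytes > SYNC_MEM_MAX)
        return SYNC_MEM_MAX;
    return bytes;
}

/* Number of whole pages covered by a byte limit. */
unsigned long sync_mem_pages(unsigned long bytes, unsigned long page_size)
{
    return bytes / page_size;
}
```

With a 4kB page size the default of 512kB corresponds to 128 pages, and a 16MB setting corresponds to 4096 pages.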

As an example, at 16MB the worst interval shows 4 calls to synchronize_rcu
in a second blocking for up to 30 msec in a single instance, and a total
of almost 100 msec across the 4 calls in the second. The trade off is
allowing FIB entries to consume more memory in a given time window
but with much better fib insertion rates (~30% increase in prefixes/sec).
With this patch and net.ipv4.fib_sync_mem set to 16MB, the same batch
file runs in:

$ time ./ip -batch ipv4/routes-1-hops
real 0m9.692s
user 0m2.491s
sys 0m6.769s

So the dead time is reduced to about 1/2 second or
Signed-off-by: David S. Miller

David Ahern
2019-03-22 04:29:53 +0800

08 Nov, 2018

1 commit

6897445fb net: provide a sysctl raw_l3mdev_accept for raw socket lookup with VRFs

Add a sysctl raw_l3mdev_accept to control raw socket lookup in a manner
similar to use of tcp_l3mdev_accept for stream and of udp_l3mdev_accept
for datagram sockets. Have this default to enabled for reasons of
backwards compatibility. Keeping it enabled makes it possible to specify
the output device with cmsg and IP_PKTINFO while using a socket not bound
to the corresponding VRF. This allows e.g. older ping implementations to be
run by specifying the device, without executing them in the VRF.
If the option is disabled, packets received in a VRF context are only
handled by a raw socket bound to the VRF, and correspondingly packets
in the default VRF are only handled by a socket not bound to any VRF.

Signed-off-by: Mike Manning
Reviewed-by: David Ahern
Tested-by: David Ahern
Signed-off-by: David S. Miller

Mike Manning
2018-11-08 08:12:38 +0800

27 Sep, 2018

1 commit

d4ce58082 net-tcp: /proc/sys/net/ipv4/tcp_probe_interval is a u32 not int

(fix documentation and sysctl access to treat it as such)

Tested:
# zcat /proc/config.gz | egrep ^CONFIG_HZ
CONFIG_HZ_1000=y
CONFIG_HZ=1000
# echo $[(1<<
Signed-off-by: David S. Miller

Maciej Żenczykowski
2018-09-27 11:33:21 +0800

02 Aug, 2018

2 commits

d18c5d199 net: ipv4: Notify about changes to ip_forward_update_priority

Drivers may make offloading decision based on whether
ip_forward_update_priority is enabled or not. Therefore distribute
netevent notifications to give them a chance to react to a change.

Signed-off-by: Petr Machata
Reviewed-by: Ido Schimmel
Signed-off-by: David S. Miller

Petr Machata
2018-08-02 00:52:30 +0800
432e05d32 net: ipv4: Control SKB reprioritization after forwarding

After IPv4 packets are forwarded, the priority of the corresponding SKB
is updated according to the TOS field of IPv4 header. This overrides any
prioritization done earlier by e.g. an skbedit action or ingress-qos-map
defined at a vlan device.

Such overriding may not always be desirable. Even if the packet ends up
being routed, which implies this is an L3 network node, an administrator
may wish to preserve whatever prioritization was done earlier on in the
pipeline.

Therefore introduce a sysctl that controls this behavior. Keep the
default value at 1 to maintain backward-compatible behavior.

Signed-off-by: Petr Machata
Reviewed-by: Ido Schimmel
Signed-off-by: David S. Miller

Petr Machata
2018-08-02 00:52:30 +0800

06 Jul, 2018

1 commit

70ba5b6db ipv4: Return EINVAL when ping_group_range sysctl doesn't map to user ns

The low and high values of the net.ipv4.ping_group_range sysctl were
being silently forced to the default disabled state when a write to the
sysctl contained GIDs that didn't map to the associated user namespace.
Confusingly, the sysctl's write operation would return success and then
a subsequent read of the sysctl would indicate that the low and high
values are the overflowgid.

This patch changes the behavior by clearly returning an error when the
sysctl write operation receives a GID range that doesn't map to the
associated user namespace. In such a situation, the previous value of
the sysctl is preserved and that range will be returned in a subsequent
read of the sysctl.

Signed-off-by: Tyler Hicks
Signed-off-by: David S. Miller

Tyler Hicks
2018-07-06 10:51:18 +0800

30 Jun, 2018

1 commit

c860e997e tcp: fix Fast Open key endianness

The Fast Open key could be stored in a different byte order depending on
the CPU. Previously, hosts of different endianness in a server farm using
the same key config (sysctl value) would produce different cookies.
This patch fixes it by always storing the key as little endian, which keeps
the API unchanged for LE hosts.
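Storing a value in a fixed byte order, independent of the host CPU, can be done with shifts rather than casts. The sketch below assumes the key is handled as four 32-bit words; `put_le32`, `get_le32` and `key_to_le_bytes` are hypothetical helpers, not kernel functions.

```c
#include <stdint.h>

/* Store v into dst as four little-endian bytes, whatever the host order. */
void put_le32(uint8_t dst[4], uint32_t v)
{
    dst[0] = (uint8_t)(v & 0xff);
    dst[1] = (uint8_t)((v >> 8) & 0xff);
    dst[2] = (uint8_t)((v >> 16) & 0xff);
    dst[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Read four little-endian bytes back into a host-order value. */
uint32_t get_le32(const uint8_t src[4])
{
    return (uint32_t)src[0] |
           ((uint32_t)src[1] << 8) |
           ((uint32_t)src[2] << 16) |
           ((uint32_t)src[3] << 24);
}

/* Serialise a key given as four 32-bit words into 16 little-endian bytes. */
void key_to_le_bytes(const uint32_t words[4], uint8_t out[16])
{
    for (int i = 0; i < 4; i++)
        put_le32(out + 4 * i, words[i]);
}
```

Because the byte layout no longer depends on the CPU, two hosts configured with the same sysctl value end up with identical key bytes.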

Reported-by: Daniele Iamartino
Signed-off-by: Yuchung Cheng
Signed-off-by: Eric Dumazet
Signed-off-by: Neal Cardwell
Signed-off-by: David S. Miller

Yuchung Cheng
2018-06-30 17:40:46 +0800

05 Jun, 2018

1 commit

79e9fed46 net-tcp: extend tcp_tw_reuse sysctl to enable loopback only optimization

This changes the /proc/sys/net/ipv4/tcp_tw_reuse from a boolean
to an integer.

It now takes the values 0, 1 and 2, where 0 and 1 behave as before,
while 2 enables timewait socket reuse only for sockets that we can
prove are loopback connections:
i.e. bound to 'lo' interface or where one of source or destination
IPs is 127.0.0.0/8, ::ffff:127.0.0.0/104 or ::1.
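The address part of that test can be sketched in user space as below; the check for sockets bound to the 'lo' interface is not modelled. `is_loopback_v4` and `is_loopback_v6` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* addr is an IPv4 address in host byte order; matches 127.0.0.0/8. */
bool is_loopback_v4(uint32_t addr)
{
    return (addr >> 24) == 127;
}

/* addr is a 16-byte IPv6 address in network byte order.
 * Matches ::1 and the v4-mapped range ::ffff:127.0.0.0/104. */
bool is_loopback_v6(const uint8_t addr[16])
{
    static const uint8_t v4mapped_prefix[12] = {
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff
    };
    static const uint8_t loopback[16] = {
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1
    };

    if (memcmp(addr, loopback, sizeof(loopback)) == 0)
        return true;
    /* /104 = the 96-bit v4-mapped prefix plus a first IPv4 octet of 127. */
    return memcmp(addr, v4mapped_prefix, sizeof(v4mapped_prefix)) == 0 &&
           addr[12] == 127;
}
```

The /104 prefix length is simply the 96 bits of `::ffff:` followed by the 8 bits of the `127` octet.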

This enables quicker reuse of ephemeral ports for loopback connections
- where tcp_tw_reuse is 100% safe from a protocol perspective
(this assumes no artificially induced packet loss on 'lo').

This also makes establishing many loopback connections *much* faster
(allocating ports out of the first half of the ephemeral port range
is significantly faster than allocating from the second half).

Without this change in a 32K ephemeral port space my sample program
(it just establishes and closes [::1]:ephemeral -> [::1]:server_port
connections in a tight loop) fails after 32765 connections in 24 seconds.
With it enabled 50000 connections only take 4.7 seconds.

This is particularly problematic for IPv6 where we only have one local
address and cannot play tricks with varying source IP from 127.0.0.0/8
pool.

Signed-off-by: Maciej Żenczykowski
Cc: Neal Cardwell
Cc: Yuchung Cheng
Cc: Wei Wang
Change-Id: I0377961749979d0301b7b62871a32a4b34b654e1
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Maciej Żenczykowski
2018-06-05 05:13:35 +0800

18 May, 2018

2 commits

9c21d2fc4 tcp: add tcp_comp_sack_nr sysctl

This per netns sysctl allows for TCP SACK compression fine-tuning.

This limits the number of SACKs that can be compressed.
Using 0 disables SACK compression.

Signed-off-by: Eric Dumazet
Acked-by: Neal Cardwell
Signed-off-by: David S. Miller

Eric Dumazet
2018-05-18 23:40:27 +0800
6d82aa242 tcp: add tcp_comp_sack_delay_ns sysctl

This per netns sysctl allows for TCP SACK compression fine-tuning.

Its default value is 1,000,000, or 1 ms to meet TSO autosizing period.

Signed-off-by: Eric Dumazet
Acked-by: Neal Cardwell
Signed-off-by: David S. Miller

Eric Dumazet
2018-05-18 23:40:27 +0800

28 Mar, 2018

1 commit

2f635ceeb net: Drop pernet_operations::async

Synchronous pernet_operations are not allowed anymore.
All are asynchronous. So, drop the structure member.

Signed-off-by: Kirill Tkhai
Signed-off-by: David S. Miller

Kirill Tkhai
2018-03-28 01:18:09 +0800

17 Mar, 2018

1 commit

1e8029515 udp: Move the udp sysctl to namespace.

This patch moves udp_rmem_min and udp_wmem_min
to the network namespace and initializes udp_l3mdev_accept explicitly.

The udp_rmem_min/udp_wmem_min affect udp rx/tx queue,
with this patch namespaces can set them differently.

Signed-off-by: Tonghao Zhang
Signed-off-by: David S. Miller

Tonghao Zhang
2018-03-17 00:03:30 +0800

05 Mar, 2018

1 commit

3192dac64 net: Rename NETEVENT_MULTIPATH_HASH_UPDATE

Rename NETEVENT_MULTIPATH_HASH_UPDATE to
NETEVENT_IPV4_MPATH_HASH_UPDATE to denote it relates to a change
in the IPv4 hash policy.

Signed-off-by: David Ahern
Reviewed-by: Ido Schimmel
Reviewed-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller

David Ahern
2018-03-05 02:04:22 +0800

13 Feb, 2018

1 commit

22769a2a6 net: Convert ipv4_sysctl_ops

These pernet_operations create and destroy sysctl,
which are not touched by anybody else.

Signed-off-by: Kirill Tkhai
Acked-by: Andrei Vagin
Signed-off-by: David S. Miller

Kirill Tkhai
2018-02-13 23:36:09 +0800

15 Nov, 2017

1 commit

6670e1524 tcp: Namespace-ify sysctl_tcp_default_congestion_control

Make the default TCP congestion control a per-namespace
value. This changes the default congestion control to a pointer to congestion ops
(rather than implicitly the first element of the available list).

The congestion control setting of new namespaces is inherited
from the current setting of the root namespace.

Signed-off-by: Stephen Hemminger
Reviewed-by: Eric Dumazet
Signed-off-by: David S. Miller

Stephen Hemminger
2017-11-15 13:09:52 +0800

10 Nov, 2017

1 commit

356d1833b tcp: Namespace-ify sysctl_tcp_rmem and sysctl_tcp_wmem

Note that when a new netns is created, it inherits its
sysctl_tcp_rmem and sysctl_tcp_wmem from initial netns.

This change is needed so that we can refine TCP rcvbuf autotuning,
to take RTT into consideration.

Signed-off-by: Eric Dumazet
Cc: Wei Wang
Signed-off-by: David S. Miller

Eric Dumazet
2017-11-10 13:34:58 +0800

04 Nov, 2017

1 commit

2a171788b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Files removed in 'net-next' had their license header updated
in 'net'. We take the remove from 'net-next'.

Signed-off-by: David S. Miller

David S. Miller
2017-11-04 08:26:51 +0800

03 Nov, 2017

1 commit

3ae6ec082 ipv4: Send a netevent whenever multipath hash policy is changed

Devices performing IPv4 forwarding need to update their multipath hash
policy whenever it is changed.

Inform these devices by generating a netevent.

Signed-off-by: Ido Schimmel
Reviewed-by: Petr Machata
Signed-off-by: Jiri Pirko
Acked-by: David Ahern
Signed-off-by: David S. Miller

Ido Schimmel
2017-11-03 14:40:41 +0800

02 Nov, 2017

1 commit

b24413180 License cleanup: add SPDX GPL-2.0 license identifier to files with no license

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2017-11-02 18:10:55 +0800

28 Oct, 2017

6 commits

c26e91f8b tcp: Namespace-ify sysctl_tcp_pacing_ca_ratio

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2017-10-28 18:24:39 +0800
23a7102a2 tcp: Namespace-ify sysctl_tcp_pacing_ss_ratio

Also remove an obsolete comment about TCP pacing.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2017-10-28 18:24:39 +0800
4170ba6b5 tcp: Namespace-ify sysctl_tcp_invalid_ratelimit

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2017-10-28 18:24:39 +0800
790f00e19 tcp: Namespace-ify sysctl_tcp_autocorking

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2017-10-28 18:24:39 +0800
bd2397042 tcp: Namespace-ify sysctl_tcp_min_rtt_wlen

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2017-10-28 18:24:39 +0800
26e9596e5 tcp: Namespace-ify sysctl_tcp_min_tso_segs

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2017-10-28 18:24:38 +0800