23 Sep, 2022

1 commit

  • commit e22aa14866684f77b4f6b6cae98539e520ddb731 upstream.

    If we set an XFRM security policy by calling setsockopt() with option
    IPV6_XFRM_POLICY, the policy is stored in 'sock_policy' in the 'sock'
    struct. However, tcp_v6_send_response() doesn't look up the dst_entry
    with the actual socket; it uses the TCP control socket instead. As a
    result, a RST packet may be sent without ESP encryption, and the peer's
    TCP socket can't receive it.
    This patch makes the function look up the dst_entry with the actual
    socket if the socket has an XFRM policy (sock_policy), so that the TCP
    response packet sent via this function is encrypted and matches the
    encrypted TCP socket.

    Tested: We encountered this problem when a TCP socket encrypted in ESP
    transport mode received a challenge ACK in SYN_SENT state. After
    receiving the challenge ACK, TCP needs to send a RST so that the
    connection can be established on the next SYN attempt. But the RST was
    not encrypted, and the peer's TCP socket remained in ESTABLISHED state.
    We verified the fix with the test steps below.
    [Test step]
    1. Making a TCP state mismatch between client(IDLE) & server(ESTABLISHED).
    2. Client tries a new connection on the same TCP ports(src & dst).
    3. Server will return challenge ACK instead of SYN,ACK.
    4. Client will send RST to server to clear the SOCKET.
    5. Client will retransmit SYN to server on the same TCP ports.
    [Expected result]
    The TCP connection should be established.

    Cc: Maciej Żenczykowski
    Cc: Eric Dumazet
    Cc: Steffen Klassert
    Cc: Sehee Lee
    Signed-off-by: Sewook Seo
    Reviewed-by: Eric Dumazet
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    sewookseo
     

15 Sep, 2022

3 commits

  • [ Upstream commit 686dc2db2a0fdc1d34b424ec2c0a735becd8d62b ]

    Fix a bug reported and analyzed by Nagaraj Arankal, where the handling
    of a spurious non-SACK RTO could cause a connection to fail to clear
    retrans_stamp, causing a later RTO to very prematurely time out the
    connection with ETIMEDOUT.

    Here is the buggy scenario, expanding upon Nagaraj Arankal's excellent
    report:

    (*1) Send one data packet on a non-SACK connection

    (*2) Because no ACK packet is received, the packet is retransmitted
    and we enter CA_Loss; but this retransmission is spurious.

    (*3) The ACK for the original data is received. The transmitted packet
    is acknowledged. The TCP timestamp is before the retrans_stamp,
    so tcp_may_undo() returns true, and tcp_try_undo_loss() returns
    true without changing state to Open (because tcp_is_sack() is
    false), and tcp_process_loss() returns without calling
    tcp_try_undo_recovery(). Normally after undoing a CA_Loss
    episode, tcp_fastretrans_alert() would see that the connection
    has returned to CA_Open and fall through and call
    tcp_try_to_open(), which would set retrans_stamp to 0. However,
    for non-SACK connections we hold the connection in CA_Loss, so do
    not fall through to call tcp_try_to_open() and do not set
    retrans_stamp to 0. So retrans_stamp is (erroneously) still
    non-zero.

    At this point the first "retransmission event" has passed and
    been recovered from. Any future retransmission is a completely
    new "event". However, retrans_stamp is erroneously still
    set. (And we are still in CA_Loss, which is correct.)

    (*4) After 16 minutes (to correspond with tcp_retries2=15), a new data
    packet is sent. Note: No data is transmitted between (*3) and
    (*4) and we disabled keep alives.

    The socket's timeout SHOULD be calculated from this point in
    time, but instead it's calculated from the prior "event" 16
    minutes ago (step (*2)).

    (*5) Because no ACK packet is received, the packet is retransmitted.

    (*6) At the time of the 2nd retransmission, the socket returns
    ETIMEDOUT, prematurely, because retrans_stamp is (erroneously)
    too far in the past (set at the time of (*2)).

    This commit fixes this bug by ensuring that we reuse in
    tcp_try_undo_loss() the same careful logic for non-SACK connections
    that we have in tcp_try_undo_recovery(). To avoid duplicating logic,
    we factor out that logic into a new
    tcp_is_non_sack_preventing_reopen() helper and call that helper from
    both undo functions.
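
The effect of a stale retrans_stamp can be illustrated with a toy user-space model. This is only a sketch under stated assumptions, not the kernel's code: the struct, the function names, the "retransmits" counter and the 900-second budget are all hypothetical, chosen to mirror steps (*2) to (*6) above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one connection; all names here are hypothetical. */
struct conn_sketch {
    uint32_t retrans_stamp; /* time of first retransmission of the current event, 0 if none */
    unsigned retransmits;   /* retransmissions since the last undo */
};

/* An RTO fires at time 'now'. Returns true if the connection is aborted. */
static bool rto_fires(struct conn_sketch *c, uint32_t now, uint32_t budget)
{
    /* The timeout is only evaluated once at least one retransmission happened. */
    if (c->retransmits > 0 && now - c->retrans_stamp >= budget)
        return true;
    if (c->retrans_stamp == 0)
        c->retrans_stamp = now; /* start of a new retransmission event */
    c->retransmits++;
    return false;
}

/* The ACK for the original data arrives and the loss episode is undone. */
static void ack_undoes_loss(struct conn_sketch *c, bool clear_stamp)
{
    c->retransmits = 0;
    if (clear_stamp)
        c->retrans_stamp = 0; /* what the fix ensures; skipped in the buggy case */
}

/* Replays the scenario; times are in seconds, the budget is an assumed 900 s. */
static bool second_event_aborts(bool clear_stamp)
{
    struct conn_sketch c = { 0, 0 };
    const uint32_t budget = 900;

    rto_fires(&c, 10, budget);       /* (*2) spurious retransmission */
    ack_undoes_loss(&c, clear_stamp); /* (*3) undo */
    rto_fires(&c, 971, budget);      /* (*5) first retransmission of the new event */
    return rto_fires(&c, 972, budget); /* (*6) second retransmission */
}
```

In this model, second_event_aborts(false) reports an abort because the elapsed time is measured from the first event, while second_event_aborts(true) does not.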

    Fixes: da34ac7626b5 ("tcp: only undo on partial ACKs in CA_Loss")
    Reported-by: Nagaraj Arankal
    Link: https://lore.kernel.org/all/SJ0PR84MB1847BE6C24D274C46A1B9B0EB27A9@SJ0PR84MB1847.NAMPRD84.PROD.OUTLOOK.COM/
    Signed-off-by: Neal Cardwell
    Signed-off-by: Yuchung Cheng
    Reviewed-by: Eric Dumazet
    Link: https://lore.kernel.org/r/20220903121023.866900-1-ncardwell.kernel@gmail.com
    Signed-off-by: Paolo Abeni
    Signed-off-by: Sasha Levin

    Neal Cardwell
     
  • [ Upstream commit 3261400639463a853ba2b3be8bd009c2a8089775 ]

    We got a recent syzbot report [1] showing a possible misuse
    of pfmemalloc page status in TCP zerocopy paths.

    Indeed, for pages coming from user space or other layers,
    using page_is_pfmemalloc() is moot, and possibly could give
    false positives.

    There have been attempts to make page_is_pfmemalloc() more robust,
    but not using it in the first place in this context is probably better,
    and saves cpu cycles.

    Note to stable teams :

    You need to backport 84ce071e38a6 ("net: introduce
    __skb_fill_page_desc_noacc") as a prereq.

    Race is more probable after commit c07aea3ef4d4
    ("mm: add a signature in struct page") because page_is_pfmemalloc()
    is now using low order bit from page->lru.next, which can change
    more often than page->index.

    Low order bit should never be set for lru.next (when used as an anchor
    in LRU list), so KCSAN report is mostly a false positive.

    Backporting to older kernel versions seems not necessary.

    [1]
    BUG: KCSAN: data-race in lru_add_fn / tcp_build_frag

    write to 0xffffea0004a1d2c8 of 8 bytes by task 18600 on cpu 0:
    __list_add include/linux/list.h:73 [inline]
    list_add include/linux/list.h:88 [inline]
    lruvec_add_folio include/linux/mm_inline.h:105 [inline]
    lru_add_fn+0x440/0x520 mm/swap.c:228
    folio_batch_move_lru+0x1e1/0x2a0 mm/swap.c:246
    folio_batch_add_and_move mm/swap.c:263 [inline]
    folio_add_lru+0xf1/0x140 mm/swap.c:490
    filemap_add_folio+0xf8/0x150 mm/filemap.c:948
    __filemap_get_folio+0x510/0x6d0 mm/filemap.c:1981
    pagecache_get_page+0x26/0x190 mm/folio-compat.c:104
    grab_cache_page_write_begin+0x2a/0x30 mm/folio-compat.c:116
    ext4_da_write_begin+0x2dd/0x5f0 fs/ext4/inode.c:2988
    generic_perform_write+0x1d4/0x3f0 mm/filemap.c:3738
    ext4_buffered_write_iter+0x235/0x3e0 fs/ext4/file.c:270
    ext4_file_write_iter+0x2e3/0x1210
    call_write_iter include/linux/fs.h:2187 [inline]
    new_sync_write fs/read_write.c:491 [inline]
    vfs_write+0x468/0x760 fs/read_write.c:578
    ksys_write+0xe8/0x1a0 fs/read_write.c:631
    __do_sys_write fs/read_write.c:643 [inline]
    __se_sys_write fs/read_write.c:640 [inline]
    __x64_sys_write+0x3e/0x50 fs/read_write.c:640
    do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
    entry_SYSCALL_64_after_hwframe+0x63/0xcd

    read to 0xffffea0004a1d2c8 of 8 bytes by task 18611 on cpu 1:
    page_is_pfmemalloc include/linux/mm.h:1740 [inline]
    __skb_fill_page_desc include/linux/skbuff.h:2422 [inline]
    skb_fill_page_desc include/linux/skbuff.h:2443 [inline]
    tcp_build_frag+0x613/0xb20 net/ipv4/tcp.c:1018
    do_tcp_sendpages+0x3e8/0xaf0 net/ipv4/tcp.c:1075
    tcp_sendpage_locked net/ipv4/tcp.c:1140 [inline]
    tcp_sendpage+0x89/0xb0 net/ipv4/tcp.c:1150
    inet_sendpage+0x7f/0xc0 net/ipv4/af_inet.c:833
    kernel_sendpage+0x184/0x300 net/socket.c:3561
    sock_sendpage+0x5a/0x70 net/socket.c:1054
    pipe_to_sendpage+0x128/0x160 fs/splice.c:361
    splice_from_pipe_feed fs/splice.c:415 [inline]
    __splice_from_pipe+0x222/0x4d0 fs/splice.c:559
    splice_from_pipe fs/splice.c:594 [inline]
    generic_splice_sendpage+0x89/0xc0 fs/splice.c:743
    do_splice_from fs/splice.c:764 [inline]
    direct_splice_actor+0x80/0xa0 fs/splice.c:931
    splice_direct_to_actor+0x305/0x620 fs/splice.c:886
    do_splice_direct+0xfb/0x180 fs/splice.c:974
    do_sendfile+0x3bf/0x910 fs/read_write.c:1249
    __do_sys_sendfile64 fs/read_write.c:1317 [inline]
    __se_sys_sendfile64 fs/read_write.c:1303 [inline]
    __x64_sys_sendfile64+0x10c/0x150 fs/read_write.c:1303
    do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
    entry_SYSCALL_64_after_hwframe+0x63/0xcd

    value changed: 0x0000000000000000 -> 0xffffea0004a1d288

    Reported by Kernel Concurrency Sanitizer on:
    CPU: 1 PID: 18611 Comm: syz-executor.4 Not tainted 6.0.0-rc2-syzkaller-00248-ge022620b5d05-dirty #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/22/2022

    Fixes: c07aea3ef4d4 ("mm: add a signature in struct page")
    Reported-by: syzbot
    Signed-off-by: Eric Dumazet
    Cc: Shakeel Butt
    Reviewed-by: Shakeel Butt
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Eric Dumazet
     
  • [ Upstream commit ac56a0b48da86fd1b4389632fb7c4c8a5d86eefa ]

    Because rxrpc pretends to be a tunnel on top of a UDP/UDP6 socket, allowing
    it to siphon off UDP packets early in the handling of received UDP packets
    thereby avoiding the packet going through the UDP receive queue, it doesn't
    get ICMP packets through the UDP ->sk_error_report() callback. In fact, it
    doesn't appear that there's any usable option for getting hold of ICMP
    packets.

    Fix this by adding a new UDP encap hook to distribute error messages for
    UDP tunnels. If the hook is set, then the tunnel driver will be able to
    see ICMP packets. The hook provides the offset into the packet of the UDP
    header of the original packet that caused the notification.

    An alternative would be to call the ->error_handler() hook - but that
    requires that the skbuff be cloned (as ip_icmp_error() or ipv6_icmp_error()
    do), which isn't really necessary or desirable in rxrpc's case, as we want
    to parse them there and then, not queue them.

    Changes
    =======
    ver #3)
    - Fixed an uninitialised variable.

    ver #2)
    - Fixed some missing CONFIG_AF_RXRPC_IPV6 conditionals.

    Fixes: 5271953cad31 ("rxrpc: Use the UDP encap_rcv hook")
    Signed-off-by: David Howells
    Signed-off-by: Sasha Levin

    David Howells
     

08 Sep, 2022

2 commits

  • commit eb55dc09b5dd040232d5de32812cc83001a23da6 upstream.

    __mkroute_input() uses fib_validate_source() to trigger an icmp redirect.
    My understanding is that fib_validate_source() is used to know if the src
    address and the gateway address are on the same link. For that,
    fib_validate_source() returns 1 (same link) or 0 (not the same network).
    __mkroute_input() is the only user of these positive values, all other
    callers only look if the returned value is negative.

    Since the patch below, fib_validate_source() no longer returned 1 when
    both addresses are on the same network, because the route lookup returns
    RT_SCOPE_LINK instead of RT_SCOPE_HOST. But this is, in fact, right.
    Let's adapt the test to return 1 again when both addresses are on the same
    link.

    CC: stable@vger.kernel.org
    Fixes: 747c14307214 ("ip: fix dflt addr selection for connected nexthop")
    Reported-by: kernel test robot
    Reported-by: Heng Qi
    Signed-off-by: Nicolas Dichtel
    Reviewed-by: David Ahern
    Link: https://lore.kernel.org/r/20220829100121.3821-1-nicolas.dichtel@6wind.com
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Greg Kroah-Hartman

    Nicolas Dichtel
     
  • [ Upstream commit 8c70521238b7863c2af607e20bcba20f974c969b ]

    challenge_timestamp can be read and written by concurrent threads.

    This was expected, but we need to annotate the race to avoid potential issues.

    The following patch moves challenge_timestamp and challenge_count
    to per-netns storage to provide better isolation.

    Fixes: 354e4aa391ed ("tcp: RFC 5961 5.2 Blind Data Injection Attack Mitigation")
    Reported-by: syzbot
    Signed-off-by: Eric Dumazet
    Acked-by: Neal Cardwell
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Sasha Levin

    Eric Dumazet
     

31 Aug, 2022

5 commits

  • [ Upstream commit a5612ca10d1aa05624ebe72633e0c8c792970833 ]

    While reading sysctl_devconf_inherit_init_net, it can be changed
    concurrently. Thus, we need to add READ_ONCE() to its readers.
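
The pattern shared by this and the similar sysctl commits in this log can be sketched in user space as follows. This is a minimal illustration, not the kernel's definition: READ_ONCE_SKETCH is a simplified stand-in for READ_ONCE(), and the tunable and function names are hypothetical.

```c
/* Simplified stand-in for the kernel's READ_ONCE(): a single volatile load.
 * Uses the GCC/Clang __typeof__ extension. */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical tunable that another thread may rewrite at any time. */
static int sysctl_example_limit = 64;

/* Take one snapshot of the tunable and use only the local copy, so the
 * comparison and the returned value agree on the same limit. */
static int clamp_to_limit(int requested)
{
    int limit = READ_ONCE_SKETCH(sysctl_example_limit);

    if (requested > limit)
        return limit;
    return requested;
}
```

Without the snapshot, the compiler would be free to load the variable twice, and the check and the result could see different values if a writer runs in between.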

    Fixes: 856c395cfa63 ("net: introduce a knob to control whether to inherit devconf config")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Kuniyuki Iwashima
     
  • [ Upstream commit 657b991afb89d25fe6c4783b1b75a8ad4563670d ]

    While reading sysctl_max_skb_frags, it can be changed concurrently.
    Thus, we need to add READ_ONCE() to its readers.

    Fixes: 5f74f82ea34c ("net:Add sysctl_max_skb_frags")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Kuniyuki Iwashima
     
  • [ Upstream commit 04d8825c30b718781197c8f07b1915a11bfb8685 ]

    The tcp_skb_entail() helper is actually skb_entail(), renamed
    to provide proper scope.

    The two helpers will be used by the next patch.

    RFC -> v1:
    - rename skb_entail to tcp_skb_entail (Eric)

    Acked-by: Mat Martineau
    Signed-off-by: Paolo Abeni
    Reviewed-by: Eric Dumazet
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Paolo Abeni
     
  • [ Upstream commit 7de6d09f51917c829af2b835aba8bb5040f8e86a ]

    While reading sysctl_optmem_max, it can be changed concurrently.
    Thus, we need to add READ_ONCE() to its readers.

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Kuniyuki Iwashima
     
  • [ Upstream commit 1227c1771dd2ad44318aa3ab9e3a293b3f34ff2a ]

    While reading sysctl_[rw]mem_(max|default), they can be changed
    concurrently. Thus, we need to add READ_ONCE() to their readers.

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Kuniyuki Iwashima
     

17 Aug, 2022

4 commits

  • commit c4ee118561a0f74442439b7b5b486db1ac1ddfeb upstream.

    sk_forced_mem_schedule() has a bug similar to ones fixed
    in commit 7c80b038d23e ("net: fix sk_wmem_schedule() and
    sk_rmem_schedule() errors")

    While this bug has little chance to trigger in old kernels,
    we need to fix it before the following patch.

    Fixes: d83769a580f1 ("tcp: fix possible deadlock in tcp_send_fin()")
    Signed-off-by: Eric Dumazet
    Acked-by: Soheil Hassas Yeganeh
    Reviewed-by: Shakeel Butt
    Reviewed-by: Wei Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     
  • [ Upstream commit 5d368f03280d3678433a7f119efe15dfbbb87bc8 ]

    INET6_MATCH() runs without holding a lock on the socket.

    We probably need to annotate most reads.

    This patch makes INET6_MATCH() an inline function
    to ease our changes.

    v2: inline function only defined if IS_ENABLED(CONFIG_IPV6)
    Change the name to inet6_match(), this is no longer a macro.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Eric Dumazet
     
  • [ Upstream commit 4915d50e300e96929d2462041d6f6c6f061167fd ]

    INET_MATCH() runs without holding a lock on the socket.

    We probably need to annotate most reads.

    This patch makes INET_MATCH() an inline function
    to ease our changes.

    v2:

    We remove the 32bit version of it, as modern compilers
    should generate the same code really, no need to
    try to be smarter.

    Also make 'struct net *net' the first argument.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Eric Dumazet
     
  • [ Upstream commit 536a6c8e05f95e3d1118c40ae8b3022ee2d05d52 ]

    The current code of __tcp_retransmit_skb() only checks that
    TCP_SKB_CB(skb)->seq is in the send window, so TCP_SKB_CB(skb)->end_seq
    may be out of the send window. If the receiver has shrunk its window and
    the skb extends beyond the new window, only a smaller portion of the
    payload should be retransmitted.

    test packetdrill script:
    0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
    +0 fcntl(3, F_GETFL) = 0x2 (flags O_RDWR)
    +0 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0

    +0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)
    +0 > S 0:0(0) win 65535
    +.05 < S. 0:0(0) ack 1 win 6000
    +0 > . 1:1(0) ack 1

    +0 write(3, ..., 10000) = 10000

    +0 > . 1:2001(2000) ack 1 win 65535
    +0 > . 2001:4001(2000) ack 1 win 65535
    +0 > . 4001:6001(2000) ack 1 win 65535

    +.05 < . 1:1(0) ack 4001 win 1001

    and tcpdump shows:
    192.168.226.67.55 > 192.0.2.1.8080: Flags [.], seq 1:2001, ack 1, win 65535, length 2000
    192.168.226.67.55 > 192.0.2.1.8080: Flags [.], seq 2001:4001, ack 1, win 65535, length 2000
    192.168.226.67.55 > 192.0.2.1.8080: Flags [P.], seq 4001:5001, ack 1, win 65535, length 1000
    192.168.226.67.55 > 192.0.2.1.8080: Flags [.], seq 5001:6001, ack 1, win 65535, length 1000
    192.0.2.1.8080 > 192.168.226.67.55: Flags [.], ack 4001, win 1001, length 0
    192.168.226.67.55 > 192.0.2.1.8080: Flags [.], seq 5001:6001, ack 1, win 65535, length 1000
    192.168.226.67.55 > 192.0.2.1.8080: Flags [P.], seq 4001:5001, ack 1, win 65535, length 1000

    When the receiver retracts its window to 1001, the send window is
    [4001,5002], but TLP sends the 5001-6001 packet, which is out of the
    send window.
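
How much of a segment still fits in a shrunken window can be sketched with wrap-safe sequence arithmetic. This is an illustrative user-space helper, not the code of the patch; the function names are hypothetical, and the window end is assumed to be snd_una + snd_wnd.

```c
#include <stdint.h>

/* True if sequence number a comes before b, allowing for 32-bit wrap-around. */
static int seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* Number of bytes of the segment [seq, end_seq) that lie below the end of
 * the send window, taken here as snd_una + snd_wnd. */
static uint32_t bytes_in_window(uint32_t seq, uint32_t end_seq,
                                uint32_t snd_una, uint32_t snd_wnd)
{
    uint32_t wnd_end = snd_una + snd_wnd;

    if (!seq_before(seq, wnd_end))
        return 0;                /* segment starts at or past the window end */
    if (seq_before(wnd_end, end_seq))
        return wnd_end - seq;    /* only the head of the segment fits */
    return end_seq - seq;        /* the whole segment fits */
}
```

With the numbers from the trace (snd_una 4001, window 1001), the 4001:5001 segment fits entirely, while only one byte of the 5001:6001 segment is inside the window.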

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Yonglong Li
    Signed-off-by: Eric Dumazet
    Link: https://lore.kernel.org/r/1657532838-20200-1-git-send-email-liyonglong@chinatelecom.cn
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Sasha Levin

    Yonglong Li
     

03 Aug, 2022

21 commits

  • [ Upstream commit 96b9bd8c6d125490f9adfb57d387ef81a55a103e ]

    While reading sysctl_fib_notify_on_flag_change, it can be changed
    concurrently. Thus, we need to add READ_ONCE() to its readers.

    Fixes: 680aea08e78c ("net: ipv4: Emit notification when fib hardware flags are changed")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Kuniyuki Iwashima
     
  • [ Upstream commit 870e3a634b6a6cb1543b359007aca73fe6a03ac5 ]

    While reading sysctl_tcp_reflect_tos, it can be changed concurrently.
    Thus, we need to add READ_ONCE() to its readers.

    Fixes: ac8f1710c12b ("tcp: reflect tos value received in SYN to the socket")
    Signed-off-by: Kuniyuki Iwashima
    Acked-by: Wei Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Kuniyuki Iwashima
     
  • [ Upstream commit 79f55473bfc8ac51bd6572929a679eeb4da22251 ]

    While reading sysctl_tcp_comp_sack_nr, it can be changed concurrently.
    Thus, we need to add READ_ONCE() to its reader.

    Fixes: 9c21d2fc41c0 ("tcp: add tcp_comp_sack_nr sysctl")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Kuniyuki Iwashima
     
  • [ Upstream commit 22396941a7f343d704738360f9ef0e6576489d43 ]

    While reading sysctl_tcp_comp_sack_slack_ns, it can be changed
    concurrently. Thus, we need to add READ_ONCE() to its reader.

    Fixes: a70437cc09a1 ("tcp: add hrtimer slack to sack compression")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Kuniyuki Iwashima
     
  • [ Upstream commit 4866b2b0f7672b6d760c4b8ece6fb56f965dcc8a ]

    While reading sysctl_tcp_comp_sack_delay_ns, it can be changed
    concurrently. Thus, we need to add READ_ONCE() to its reader.

    Fixes: 6d82aa242092 ("tcp: add tcp_comp_sack_delay_ns sysctl")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Kuniyuki Iwashima
     
  • [ Upstream commit 02739545951ad4c1215160db7fbf9b7a918d3c0b ]

    While reading these sysctl variables, they can be changed concurrently.
    Thus, we need to add READ_ONCE() to their readers.

    - .sysctl_rmem
    - .sysctl_wmem
    - .sysctl_rmem_offset
    - .sysctl_wmem_offset
    - sysctl_tcp_rmem[1, 2]
    - sysctl_tcp_wmem[1, 2]
    - sysctl_decnet_rmem[1]
    - sysctl_decnet_wmem[1]
    - sysctl_tipc_rmem[1]

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Kuniyuki Iwashima
     
  • [ Upstream commit 59bf6c65a09fff74215517aecffbbdcd67df76e3 ]

    While reading sysctl_tcp_pacing_(ss|ca)_ratio, they can be changed
    concurrently. Thus, we need to add READ_ONCE() to their readers.

    Fixes: 43e122b014c9 ("tcp: refine pacing rate determination")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Kuniyuki Iwashima
     
  • [ Upstream commit 2afdbe7b8de84c28e219073a6661080e1b3ded48 ]

    While reading sysctl_tcp_invalid_ratelimit, it can be changed
    concurrently. Thus, we need to add READ_ONCE() to its reader.

    Fixes: 032ee4236954 ("tcp: helpers to mitigate ACK loops by rate-limiting out-of-window dupacks")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Kuniyuki Iwashima
     
  • [ Upstream commit 85225e6f0a76e6745bc841c9f25169c509b573d8 ]

    While reading sysctl_tcp_autocorking, it can be changed concurrently.
    Thus, we need to add READ_ONCE() to its reader.

    Fixes: f54b311142a9 ("tcp: auto corking")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Kuniyuki Iwashima
     
  • [ Upstream commit 1330ffacd05fc9ac4159d19286ce119e22450ed2 ]

    While reading sysctl_tcp_min_rtt_wlen, it can be changed concurrently.
    Thus, we need to add READ_ONCE() to its reader.

    Fixes: f672258391b4 ("tcp: track min RTT using windowed min-filter")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Kuniyuki Iwashima
     
  • [ Upstream commit e0bb4ab9dfddd872622239f49fb2bd403b70853b ]

    While reading sysctl_tcp_min_tso_segs, it can be changed concurrently.
    Thus, we need to add READ_ONCE() to its reader.

    Fixes: 95bd09eb2750 ("tcp: TSO packets automatic sizing")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Kuniyuki Iwashima
     
  • [ Upstream commit 8ebcc62c738f68688ee7c6fec2efe5bc6d3d7e60 ]

    While reading sysctl_igmp_qrv, it can be changed concurrently.
    Thus, we need to add READ_ONCE() to its readers.

    This test can be packed into a helper, so such changes will be in the
    follow-up series after net is merged into net-next.

    qrv ?: READ_ONCE(net->ipv4.sysctl_igmp_qrv);
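
One possible shape for such a helper is sketched below in user space. The helper name, the struct and the READ_ONCE_SKETCH macro are assumptions made for illustration; the actual follow-up series may look different.

```c
/* Simplified stand-in for the kernel's READ_ONCE(): a single volatile load.
 * Uses the GCC/Clang __typeof__ extension. */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for the per-netns IPv4 settings. */
struct netns_ipv4_sketch {
    int sysctl_igmp_qrv;
};

/* Prefer a non-zero qrv supplied by the caller; otherwise fall back to one
 * snapshot of the sysctl value. */
static int igmp_effective_qrv(int qrv, struct netns_ipv4_sketch *ipv4)
{
    return qrv ? qrv : READ_ONCE_SKETCH(ipv4->sysctl_igmp_qrv);
}
```

The portable "qrv ? qrv : ..." form is used here instead of the GNU "?:" shorthand quoted above; both evaluate to qrv when it is non-zero.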

    Fixes: a9fe8e29945d ("ipv4: implement igmp_qrv sysctl to tune igmp robustness variable")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Kuniyuki Iwashima
     
  • commit db3815a2fa691da145cfbe834584f31ad75df9ff upstream.

    While reading sysctl_tcp_challenge_ack_limit, it can be changed
    concurrently. Thus, we need to add READ_ONCE() to its reader.

    Fixes: 282f23c6ee34 ("tcp: implement RFC 5961 3.2")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Kuniyuki Iwashima
     
  • commit 9fb90193fbd66b4c5409ef729fd081861f8b6351 upstream.

    While reading sysctl_tcp_limit_output_bytes, it can be changed
    concurrently. Thus, we need to add READ_ONCE() to its reader.

    Fixes: 46d3ceabd8d9 ("tcp: TCP Small Queues")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Kuniyuki Iwashima
     
  • commit 780476488844e070580bfc9e3bc7832ec1cea883 upstream.

    While reading sysctl_tcp_moderate_rcvbuf, it can be changed
    concurrently. Thus, we need to add READ_ONCE() to its readers.

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Kuniyuki Iwashima
     
  • commit 4d8f24eeedc58d5f87b650ddda73c16e8ba56559 upstream.

    This reverts commit 4a41f453bedfd5e9cd040bad509d9da49feb3e2c.

    This to-be-reverted commit was meant to apply a stricter rule for the
    stack to enter pingpong mode. However, the condition used to check for
    interactive session "before(tp->lsndtime, icsk->icsk_ack.lrcvtime)" is
    jiffy based and might be too coarse, which delays the stack entering
    pingpong mode.
    We revert this patch so that we no longer use the above condition to
    determine interactive session, and also reduce pingpong threshold to 1.
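
Why a jiffy-based comparison can be too coarse is illustrated by the sketch below. It is a user-space approximation, assuming a tick rate of 1000 per second; the helper names and the microsecond inputs are hypothetical and do not come from the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

#define HZ_SKETCH 1000u /* assumed tick rate: one jiffy per millisecond */

/* Convert a microsecond timestamp to a 32-bit jiffy count. */
static uint32_t usec_to_jiffies_sketch(uint64_t usec)
{
    return (uint32_t)(usec / (1000000u / HZ_SKETCH));
}

/* Wrap-safe "a is earlier than b" on 32-bit counters. */
static bool before_sketch(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* Mirrors the shape of the reverted check: the last send must be strictly
 * earlier, in jiffies, than the last receive. */
static bool looks_interactive(uint64_t last_send_usec, uint64_t last_recv_usec)
{
    return before_sketch(usec_to_jiffies_sketch(last_send_usec),
                         usec_to_jiffies_sketch(last_recv_usec));
}
```

A send at 5000100 us followed by a receive at 5000900 us falls into the same jiffy, so the check fails even though the send really happened first; it only succeeds once the receive lands in a later jiffy.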

    Fixes: 4a41f453bedf ("tcp: change pingpong threshold to 3")
    Reported-by: LemmyHuang
    Suggested-by: Neal Cardwell
    Signed-off-by: Wei Wang
    Acked-by: Neal Cardwell
    Reviewed-by: Eric Dumazet
    Link: https://lore.kernel.org/r/20220721204404.388396-1-weiwan@google.com
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Greg Kroah-Hartman

    Wei Wang
     
  • commit ab1ba21b523ab496b1a4a8e396333b24b0a18f9a upstream.

    While reading sysctl_tcp_no_ssthresh_metrics_save, it can be changed
    concurrently. Thus, we need to add READ_ONCE() to its readers.

    Fixes: 65e6d90168f3 ("net-tcp: Disable TCP ssthresh metrics cache by default")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Kuniyuki Iwashima
     
  • commit 8499a2454d9e8a55ce616ede9f9580f36fd5b0f3 upstream.

    While reading sysctl_tcp_nometrics_save, it can be changed concurrently.
    Thus, we need to add READ_ONCE() to its reader.

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Kuniyuki Iwashima
     
  • commit 706c6202a3589f290e1ef9be0584a8f4a3cc0507 upstream.

    While reading sysctl_tcp_frto, it can be changed concurrently.
    Thus, we need to add READ_ONCE() to its reader.

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Kuniyuki Iwashima
     
  • commit 02ca527ac5581cf56749db9fd03d854e842253dd upstream.

    While reading sysctl_tcp_app_win, it can be changed concurrently.
    Thus, we need to add READ_ONCE() to its reader.

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Kuniyuki Iwashima
     
  • commit 58ebb1c8b35a8ef38cd6927431e0fa7b173a632d upstream.

    While reading sysctl_tcp_dsack, it can be changed concurrently.
    Thus, we need to add READ_ONCE() to its readers.

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Kuniyuki Iwashima
     

29 Jul, 2022

4 commits

  • [ Upstream commit a11e5b3e7a59fde1a90b0eaeaa82320495cf8cae ]

    While reading sysctl_tcp_max_reordering, it can be changed
    concurrently. Thus, we need to add READ_ONCE() to its readers.

    Fixes: dca145ffaa8d ("tcp: allow for bigger reordering level")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Kuniyuki Iwashima
     
  • [ Upstream commit 2d17d9c7382327d00aeaea35af44e9b26d53206e ]

    While reading sysctl_tcp_abort_on_overflow, it can be changed concurrently.
    Thus, we need to add READ_ONCE() to its reader.

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Kuniyuki Iwashima
     
  • [ Upstream commit 0b484c91911e758e53656d570de58c2ed81ec6f2 ]

    While reading sysctl_tcp_rfc1337, it can be changed concurrently.
    Thus, we need to add READ_ONCE() to its reader.

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Kuniyuki Iwashima
     
  • [ Upstream commit 4e08ed41cb1194009fc1a916a59ce3ed4afd77cd ]

    While reading sysctl_tcp_stdurg, it can be changed concurrently.
    Thus, we need to add READ_ONCE() to its reader.

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Kuniyuki Iwashima
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Kuniyuki Iwashima