24 Aug, 2020

1 commit

  • Replace the existing /* fall through */ comments and their variants with
    the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
    fall-through markings where applicable.

    [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva
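
For illustration, here is a minimal userspace sketch of how a fallthrough pseudo-keyword can be defined and used. The macro follows the general shape of the kernel's definition (a statement attribute with an empty-statement fallback). The function cleanup_steps() is hypothetical and exists only for this example.

```c
#include <assert.h>

/* Use the statement attribute when the compiler supports it,
 * otherwise fall back to an empty statement. */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif
#if __has_attribute(__fallthrough__)
# define fallthrough __attribute__((__fallthrough__))
#else
# define fallthrough do {} while (0) /* fallthrough */
#endif

/* Hypothetical example: count how many steps run for a given stage. */
static int cleanup_steps(int stage)
{
	int steps = 0;

	switch (stage) {
	case 2:
		steps++;
		fallthrough;	/* intentional: stage 2 also runs stage 1 */
	case 1:
		steps++;
		fallthrough;	/* intentional: stage 1 also runs stage 0 */
	case 0:
		steps++;
		break;
	default:
		break;
	}
	return steps;
}
```

Unlike a comment, the macro is visible to the compiler, so -Wimplicit-fallthrough can flag any case that falls through without it.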
     

29 Jul, 2020

1 commit

  • sockptr_advance never properly worked. Replace it with _offset variants
    of copy_from_sockptr and copy_to_sockptr.

    Fixes: ba423fdaa589 ("net: add a new sockptr_t type")
    Reported-by: Jason A. Donenfeld
    Reported-by: Ido Schimmel
    Signed-off-by: Christoph Hellwig
    Acked-by: Jason A. Donenfeld
    Tested-by: Ido Schimmel
    Signed-off-by: David S. Miller

    Christoph Hellwig
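
The idea behind the _offset variants can be sketched in userspace as follows. This is a simplified model, not the kernel code: sockptr_model_t and copy_from_sockptr_offset_model() are hypothetical names, and plain memcpy() stands in for the kernel/user copy routines. The point it shows is that the offset is passed to the copy helper instead of advancing the pointer itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Userspace model of a socket pointer: one pointer plus a kernel/user tag. */
typedef struct {
	void *ptr;
	bool is_kernel;
} sockptr_model_t;

/* Copy size bytes starting at offset within src; src is left unchanged. */
static int copy_from_sockptr_offset_model(void *dst, sockptr_model_t src,
					  size_t offset, size_t size)
{
	if (!src.ptr)
		return -1;
	/* The real helper would choose a kernel or user copy based on the
	 * tag; this model only has ordinary memory, so memcpy() is used. */
	memcpy(dst, (const char *)src.ptr + offset, size);
	return 0;
}
```

Because the caller's pointer value is never modified, there is no advanced copy that could lose or corrupt the kernel/user tag.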
     

25 Jul, 2020

1 commit

  • Rework the remaining setsockopt code to pass a sockptr_t instead of a
    plain user pointer. This removes the last remaining set_fs(KERNEL_DS)
    outside of architecture specific code.

    Signed-off-by: Christoph Hellwig
    Acked-by: Stefan Schmidt [ieee802154]
    Acked-by: Matthieu Baerts
    Signed-off-by: David S. Miller

    Christoph Hellwig
     

23 Jul, 2020

1 commit

  • This adds support for the SIOCOUTQ IOCTL to get the send buffer fill
    of a DCCP socket, like UDP and TCP sockets already have.

    Regarding the data field used: DCCP uses per-packet sequence numbers,
    not per-byte, so sequence numbers can't be used as in TCP. sk_wmem_queued
    is not used by DCCP and is always 0, even in tests on highly congested paths.
    Therefore this uses sk_wmem_alloc, as UDP does.

    Signed-off-by: Richard Sailer
    Signed-off-by: David S. Miller

    Richard Sailer
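
From userspace, the send buffer fill can be queried roughly as sketched below. The helper name send_queue_bytes() is hypothetical. The sketch works on any socket type that implements SIOCOUTQ, so it can be tried on a UDP socket when DCCP is not available.

```c
#include <assert.h>
#include <linux/sockios.h>	/* SIOCOUTQ */
#include <netinet/in.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Return the send queue fill reported by SIOCOUTQ, or -1 on error. */
static int send_queue_bytes(int fd)
{
	int pending = 0;

	if (ioctl(fd, SIOCOUTQ, &pending) < 0)
		return -1;
	return pending;
}
```

A caller would typically create the socket, send some data, and poll send_queue_bytes(fd) to see how much is still queued.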
     

20 Jul, 2020

1 commit


10 Jun, 2020

1 commit

  • There are some memory leaks in dccp_init() and dccp_fini().

    In dccp_fini() and the error handling path in dccp_init(), freeing lhash2
    is missing. Add inet_hashinfo2_free_mod() to do it.

    If inet_hashinfo2_init_mod() fails in dccp_init(),
    percpu_counter_destroy() should be called to destroy dccp_orphan_count.
    dccp_init() needs to goto out_free_percpu when inet_hashinfo2_init_mod() fails.

    Fixes: c92c81df93df ("net: dccp: fix kernel crash on module load")
    Reported-by: Hulk Robot
    Signed-off-by: Wang Hai
    Signed-off-by: David S. Miller

    Wang Hai
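
The unwinding pattern described above can be sketched as a small userspace model. All names here are hypothetical stand-ins (counters instead of real allocations), and the ordering of the init steps is an assumption made for the example; only the label out_free_percpu is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical resource counters standing in for real allocations. */
static int live_counter, live_lhash2, live_hash;
static bool fail_lhash2, fail_hash;	/* knobs to force a step to fail */

static int counter_init(void) { live_counter++; return 0; }
static void counter_destroy(void) { live_counter--; }
static int lhash2_init(void) { if (fail_lhash2) return -1; live_lhash2++; return 0; }
static void lhash2_free(void) { live_lhash2--; }
static int hash_init(void) { if (fail_hash) return -1; live_hash++; return 0; }
static void hash_free(void) { live_hash--; }

static int model_init(void)
{
	int rc = counter_init();

	if (rc)
		goto out;
	rc = lhash2_init();
	if (rc)
		goto out_free_percpu;	/* undo the counter, do not just return */
	rc = hash_init();
	if (rc)
		goto out_free_lhash2;	/* undo lhash2 and the counter */
	return 0;

out_free_lhash2:
	lhash2_free();
out_free_percpu:
	counter_destroy();
out:
	return rc;
}

/* Teardown releases everything init acquired, in reverse order. */
static void model_fini(void)
{
	hash_free();
	lhash2_free();
	counter_destroy();
}
```

Each failure label releases exactly what was acquired before the failing step, so neither the error path nor the exit path leaks.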
     

10 Dec, 2019

1 commit

  • Replace all the occurrences of FIELD_SIZEOF() with sizeof_field() except
    at places where these are defined. Later patches will remove the unused
    definition of FIELD_SIZEOF().

    This patch is generated using the following script:

    EXCLUDE_FILES="include/linux/stddef.h|include/linux/kernel.h"

    git grep -l -e "\bFIELD_SIZEOF\b" | while read file;
    do

    if [[ "$file" =~ $EXCLUDE_FILES ]]; then
    continue
    fi
    sed -i -e 's/\bFIELD_SIZEOF\b/sizeof_field/g' $file;
    done

    Signed-off-by: Pankaj Bharadiya
    Link: https://lore.kernel.org/r/20190924105839.110713-3-pankaj.laxminarayan.bharadiya@intel.com
    Co-developed-by: Kees Cook
    Signed-off-by: Kees Cook
    Acked-by: David Miller # for net

    Pankaj Bharadiya
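
As a brief illustration of what the renamed macro does, the sketch below defines sizeof_field() in the usual null-pointer form and applies it to a struct. The struct demo_hdr is hypothetical and exists only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of a struct member, without needing an instance of the struct. */
#define sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))

/* Hypothetical struct used only for this example. */
struct demo_hdr {
	uint8_t  type;
	uint16_t port;
	uint8_t  addr[16];
};
```

Because sizeof does not evaluate its operand, the null pointer is never dereferenced. For example, sizeof_field(struct demo_hdr, addr) is a compile-time constant usable in array bounds or static assertions.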
     

07 Nov, 2019

1 commit


19 Jun, 2019

1 commit

  • Based on 2 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation #

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 4122 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Enrico Weigelt
    Reviewed-by: Kate Stewart
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

13 May, 2019

1 commit


29 Dec, 2018

2 commits

  • totalram_pages and totalhigh_pages are made static inline functions.

    The main motivation was that managed_page_count_lock handling was
    complicating things. It was discussed at length here:
    https://lore.kernel.org/patchwork/patch/995739/#1181785 So it seems
    better to remove the lock and convert the variables to atomic, with
    prevention of potential store-to-read tearing as a bonus.

    [akpm@linux-foundation.org: coding style fixes]
    Link: http://lkml.kernel.org/r/1542090790-21750-4-git-send-email-arunks@codeaurora.org
    Signed-off-by: Arun KS
    Suggested-by: Michal Hocko
    Suggested-by: Vlastimil Babka
    Reviewed-by: Konstantin Khlebnikov
    Reviewed-by: Pavel Tatashin
    Acked-by: Michal Hocko
    Acked-by: Vlastimil Babka
    Cc: David Hildenbrand
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun KS
     
  • Patch series "mm: convert totalram_pages, totalhigh_pages and managed
    pages to atomic", v5.

    This series converts totalram_pages, totalhigh_pages and
    zone->managed_pages to atomic variables.

    totalram_pages, zone->managed_pages and totalhigh_pages updates are
    protected by managed_page_count_lock, but readers never care about it.
    Convert these variables to atomic to avoid readers potentially seeing a
    store tear.

    The main motivation was that managed_page_count_lock handling was
    complicating things. It was discussed at length here:
    https://lore.kernel.org/patchwork/patch/995739/#1181785 It seems better
    to remove the lock and convert the variables to atomic. With the change,
    preventing potential store-to-read tearing comes as a bonus.

    This patch (of 4):

    This is in preparation for a later patch which converts totalram_pages and
    zone->managed_pages to atomic variables. Please note that re-reading the
    value might yield a different value, which could lead to
    unexpected behavior. There are no known bugs as a result of the current
    code, but it is better to prevent them in principle.

    Link: http://lkml.kernel.org/r/1542090790-21750-2-git-send-email-arunks@codeaurora.org
    Signed-off-by: Arun KS
    Reviewed-by: Konstantin Khlebnikov
    Reviewed-by: David Hildenbrand
    Acked-by: Michal Hocko
    Acked-by: Vlastimil Babka
    Reviewed-by: Pavel Tatashin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun KS
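
The two ideas in this series (an atomic counter behind an accessor function, and reading the value once instead of re-reading it) can be sketched in userspace with C11 atomics. All names ending in _model, and pick_limit(), are hypothetical; this is not the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the kernel's page counter. */
static atomic_ulong totalram_pages_value_model;

/* Readers go through an accessor and get one untorn value per call. */
static inline unsigned long totalram_pages_model(void)
{
	return atomic_load_explicit(&totalram_pages_value_model,
				    memory_order_relaxed);
}

/* Writers update atomically, so no lock is needed for readers' sake.
 * A negative count wraps in unsigned arithmetic, which subtracts. */
static inline void totalram_pages_add_model(long count)
{
	atomic_fetch_add_explicit(&totalram_pages_value_model,
				  (unsigned long)count, memory_order_relaxed);
}

/* Read once into a local so the comparison and the result
 * are based on the same value. */
static unsigned long pick_limit(unsigned long cap)
{
	unsigned long pages = totalram_pages_model();

	return pages < cap ? pages : cap;
}
```

If pick_limit() called the accessor twice, a concurrent update between the two calls could make the comparison and the returned value disagree, which is the behavior the preparation patch guards against.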
     

25 Dec, 2018

1 commit

  • Patch eedbbb0d98b2 "net: dccp: initialize (addr,port) ..."
    added a call to inet_hashinfo2_init() from dccp_init().

    However, inet_hashinfo2_init() is marked as __init(), and
    thus the kernel panics when dccp is loaded as module. Removing
    __init() tag from inet_hashinfo2_init() is not feasible because
    it calls into __init functions in mm.

    This patch adds inet_hashinfo2_init_mod() function that can
    be called after the init phase is done; changes dccp_init() to
    call the new function; un-marks inet_hashinfo2_init() as
    exported.

    Fixes: eedbbb0d98b2 ("net: dccp: initialize (addr,port) ...")
    Reported-by: kernel test robot
    Signed-off-by: Peter Oskolkov
    Signed-off-by: David S. Miller

    Peter Oskolkov
     

18 Dec, 2018

1 commit

  • Commit d9fbc7f6431f "net: tcp: prefer listeners bound to an address"
    removes port-only listener lookups. This caused segfaults in DCCP
    lookups because DCCP did not initialize the (addr,port) hashtable.

    This patch adds said initialization.

    The only non-trivial issue here is the size of the new hashtable.
    It seemed reasonable to make it match the size of the port-only
    hashtable (= INET_LHTABLE_SIZE) that was used previously. Other
    parameters to inet_hashinfo2_init() match those used in TCP.

    V2 changes: marked inet_hashinfo2_init as an exported symbol
    so that DCCP compiles when configured as a module.

    Tested: syzkaller issues fixed; the second patch in the patchset
    tests that DCCP lookups work correctly.

    Fixes: d9fbc7f6431f "net: tcp: prefer listeners bound to an address"
    Reported-by: syzcaller
    Signed-off-by: Peter Oskolkov
    Signed-off-by: David S. Miller

    Peter Oskolkov
     

17 Dec, 2018

2 commits

  • This reverts commit ec49d83f245453515a9b6e88324e27bbcb69fbae.

    This causes build failures when DCCP is modular.

    ERROR: "inet_hashinfo2_init" [net/dccp/dccp.ko] undefined!

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Commit d9fbc7f6431f "net: tcp: prefer listeners bound to an address"
    removes port-only listener lookups. This caused segfaults in DCCP
    lookups because DCCP did not initialize the (addr,port) hashtable.

    This patch adds said initialization.

    The only non-trivial issue here is the size of the new hashtable.
    It seemed reasonable to make it match the size of the port-only
    hashtable (= INET_LHTABLE_SIZE) that was used previously. Other
    parameters to inet_hashinfo2_init() match those used in TCP.

    Tested: syzkaller issues fixed; the second patch in the patchset
    tests that DCCP lookups work correctly.

    Fixes: d9fbc7f6431f "net: tcp: prefer listeners bound to an address"
    Reported-by: syzcaller
    Signed-off-by: Peter Oskolkov
    Signed-off-by: David S. Miller

    Peter Oskolkov
     

08 Nov, 2018

1 commit


24 Oct, 2018

1 commit

  • This reverts commit dd979b4df817e9976f18fb6f9d134d6bc4a3c317.

    This broke tcp_poll for SMC fallback: An AF_SMC socket establishes an
    internal TCP socket for the initial handshake with the remote peer.
    Whenever the SMC connection can not be established this TCP socket is
    used as a fallback. All socket operations on the SMC socket are then
    forwarded to the TCP socket. In case of poll, the file->private_data
    pointer references the SMC socket because the TCP socket has no file
    assigned. This causes tcp_poll to wait on the wrong socket.

    Signed-off-by: Karsten Graul
    Signed-off-by: David S. Miller

    Karsten Graul
     

31 Jul, 2018

1 commit


29 Jun, 2018

1 commit

  • The poll() changes were not well thought out, and completely
    unexplained. They also caused a huge performance regression, because
    "->poll()" was no longer a trivial file operation that just called down
    to the underlying file operations, but instead did at least two indirect
    calls.

    Indirect calls are sadly slow now with the Spectre mitigation, but the
    performance problem could at least be largely mitigated by changing the
    "->get_poll_head()" operation to just have a per-file-descriptor pointer
    to the poll head instead. That gets rid of one of the new indirections.

    But that doesn't fix the new complexity that is completely unwarranted
    for the regular case. The (undocumented) reason for the poll() changes
    was some alleged AIO poll race fixing, but we don't make the common case
    slower and more complex for some uncommon special case, so this all
    really needs way more explanations and most likely a fundamental
    redesign.

    [ This revert is a revert of about 30 different commits, not reverted
    individually because that would just be unnecessarily messy - Linus ]

    Cc: Al Viro
    Cc: Christoph Hellwig
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

05 Jun, 2018

1 commit

  • Pull aio updates from Al Viro:
    "Majority of AIO stuff this cycle. aio-fsync and aio-poll, mostly.

    The only thing I'm holding back for a day or so is Adam's aio ioprio -
    his last-minute fixup is trivial (missing stub in !CONFIG_BLOCK case),
    but let it sit in -next for decency sake..."

    * 'work.aio-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
    aio: sanitize the limit checking in io_submit(2)
    aio: fold do_io_submit() into callers
    aio: shift copyin of iocb into io_submit_one()
    aio_read_events_ring(): make a bit more readable
    aio: all callers of aio_{read,write,fsync,poll} treat 0 and -EIOCBQUEUED the same way
    aio: take list removal to (some) callers of aio_complete()
    aio: add missing break for the IOCB_CMD_FDSYNC case
    random: convert to ->poll_mask
    timerfd: convert to ->poll_mask
    eventfd: switch to ->poll_mask
    pipe: convert to ->poll_mask
    crypto: af_alg: convert to ->poll_mask
    net/rxrpc: convert to ->poll_mask
    net/iucv: convert to ->poll_mask
    net/phonet: convert to ->poll_mask
    net/nfc: convert to ->poll_mask
    net/caif: convert to ->poll_mask
    net/bluetooth: convert to ->poll_mask
    net/sctp: convert to ->poll_mask
    net/tipc: convert to ->poll_mask
    ...

    Linus Torvalds
     

26 May, 2018

1 commit


23 May, 2018

1 commit

  • Syzbot reported the use-after-free in timer_is_static_object() [1].

    This can happen because the structure for the rto timer (ccid2_hc_tx_sock)
    is removed in dccp_disconnect(), and ccid2_hc_tx_rto_expire() can be
    called after that.

    The report [1] is similar to the one in commit 120e9dabaf55 ("dccp:
    defer ccid_hc_tx_delete() at dismantle time"). And the fix is the same,
    delay freeing ccid2_hc_tx_sock structure, so that it is freed in
    dccp_sk_destruct().

    [1]

    ==================================================================
    BUG: KASAN: use-after-free in timer_is_static_object+0x80/0x90
    kernel/time/timer.c:607
    Read of size 8 at addr ffff8801bebb5118 by task syz-executor2/25299

    CPU: 1 PID: 25299 Comm: syz-executor2 Not tainted 4.17.0-rc5+ #54
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
    Google 01/01/2011
    Call Trace:

    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x1b9/0x294 lib/dump_stack.c:113
    print_address_description+0x6c/0x20b mm/kasan/report.c:256
    kasan_report_error mm/kasan/report.c:354 [inline]
    kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
    __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
    timer_is_static_object+0x80/0x90 kernel/time/timer.c:607
    debug_object_activate+0x2d9/0x670 lib/debugobjects.c:508
    debug_timer_activate kernel/time/timer.c:709 [inline]
    debug_activate kernel/time/timer.c:764 [inline]
    __mod_timer kernel/time/timer.c:1041 [inline]
    mod_timer+0x4d3/0x13b0 kernel/time/timer.c:1102
    sk_reset_timer+0x22/0x60 net/core/sock.c:2742
    ccid2_hc_tx_rto_expire+0x587/0x680 net/dccp/ccids/ccid2.c:147
    call_timer_fn+0x230/0x940 kernel/time/timer.c:1326
    expire_timers kernel/time/timer.c:1363 [inline]
    __run_timers+0x79e/0xc50 kernel/time/timer.c:1666
    run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
    __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285
    invoke_softirq kernel/softirq.c:365 [inline]
    irq_exit+0x1d1/0x200 kernel/softirq.c:405
    exiting_irq arch/x86/include/asm/apic.h:525 [inline]
    smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052
    apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863

    ...
    Allocated by task 25374:
    save_stack+0x43/0xd0 mm/kasan/kasan.c:448
    set_track mm/kasan/kasan.c:460 [inline]
    kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553
    kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
    kmem_cache_alloc+0x12e/0x760 mm/slab.c:3554
    ccid_new+0x25b/0x3e0 net/dccp/ccid.c:151
    dccp_hdlr_ccid+0x27/0x150 net/dccp/feat.c:44
    __dccp_feat_activate+0x184/0x270 net/dccp/feat.c:344
    dccp_feat_activate_values+0x3a7/0x819 net/dccp/feat.c:1538
    dccp_create_openreq_child+0x472/0x610 net/dccp/minisocks.c:128
    dccp_v4_request_recv_sock+0x12c/0xca0 net/dccp/ipv4.c:408
    dccp_v6_request_recv_sock+0x125d/0x1f10 net/dccp/ipv6.c:415
    dccp_check_req+0x455/0x6a0 net/dccp/minisocks.c:197
    dccp_v4_rcv+0x7b8/0x1f3f net/dccp/ipv4.c:841
    ip_local_deliver_finish+0x2e3/0xd80 net/ipv4/ip_input.c:215
    NF_HOOK include/linux/netfilter.h:288 [inline]
    ip_local_deliver+0x1e1/0x720 net/ipv4/ip_input.c:256
    dst_input include/net/dst.h:450 [inline]
    ip_rcv_finish+0x81b/0x2200 net/ipv4/ip_input.c:396
    NF_HOOK include/linux/netfilter.h:288 [inline]
    ip_rcv+0xb70/0x143d net/ipv4/ip_input.c:492
    __netif_receive_skb_core+0x26f5/0x3630 net/core/dev.c:4592
    __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:4657
    process_backlog+0x219/0x760 net/core/dev.c:5337
    napi_poll net/core/dev.c:5735 [inline]
    net_rx_action+0x7b7/0x1930 net/core/dev.c:5801
    __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285

    Freed by task 25374:
    save_stack+0x43/0xd0 mm/kasan/kasan.c:448
    set_track mm/kasan/kasan.c:460 [inline]
    __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521
    kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
    __cache_free mm/slab.c:3498 [inline]
    kmem_cache_free+0x86/0x2d0 mm/slab.c:3756
    ccid_hc_tx_delete+0xc3/0x100 net/dccp/ccid.c:190
    dccp_disconnect+0x130/0xc66 net/dccp/proto.c:286
    dccp_close+0x3bc/0xe60 net/dccp/proto.c:1045
    inet_release+0x104/0x1f0 net/ipv4/af_inet.c:427
    inet6_release+0x50/0x70 net/ipv6/af_inet6.c:460
    sock_release+0x96/0x1b0 net/socket.c:594
    sock_close+0x16/0x20 net/socket.c:1149
    __fput+0x34d/0x890 fs/file_table.c:209
    ____fput+0x15/0x20 fs/file_table.c:243
    task_work_run+0x1e4/0x290 kernel/task_work.c:113
    tracehook_notify_resume include/linux/tracehook.h:191 [inline]
    exit_to_usermode_loop+0x2bd/0x310 arch/x86/entry/common.c:166
    prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
    syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
    do_syscall_64+0x6ac/0x800 arch/x86/entry/common.c:290
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    The buggy address belongs to the object at ffff8801bebb4cc0
    which belongs to the cache ccid2_hc_tx_sock of size 1240
    The buggy address is located 1112 bytes inside of
    1240-byte region [ffff8801bebb4cc0, ffff8801bebb5198)
    The buggy address belongs to the page:
    page:ffffea0006faed00 count:1 mapcount:0 mapping:ffff8801bebb41c0
    index:0xffff8801bebb5240 compound_mapcount: 0
    flags: 0x2fffc0000008100(slab|head)
    raw: 02fffc0000008100 ffff8801bebb41c0 ffff8801bebb5240 0000000100000003
    raw: ffff8801cdba3138 ffffea0007634120 ffff8801cdbaab40 0000000000000000
    page dumped because: kasan: bad access detected
    ...
    ==================================================================

    Reported-by: syzbot+5d47e9ec91a6f15dbd6f@syzkaller.appspotmail.com
    Signed-off-by: Alexey Kodanev
    Signed-off-by: David S. Miller

    Alexey Kodanev
     

08 Mar, 2018

1 commit

  • dccp_disconnect() sets 'dp->dccps_hc_tx_ccid' tx handler to NULL,
    therefore if DCCP socket is disconnected and dccp_sendmsg() is
    called after it, it will cause a NULL pointer dereference in
    dccp_write_xmit().

    This crash and the reproducer were reported by syzbot. It looks like
    it is reproducible when commit 69c64866ce07 ("dccp: CVE-2017-8824:
    use-after-free in DCCP code") is applied.

    Reported-by: syzbot+f99ab3887ab65d70f816@syzkaller.appspotmail.com
    Signed-off-by: Alexey Kodanev
    Signed-off-by: David S. Miller

    Alexey Kodanev
     

12 Feb, 2018

1 commit

  • This is the mindless scripted replacement of kernel use of POLL*
    variables as described by Al, done by this script:

    for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
    L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
    for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
    done

    with de-mangling cleanups yet to come.

    NOTE! On almost all architectures, the EPOLL* constants have the same
    values as the POLL* constants do. But the keyword here is "almost".
    For various bad reasons they aren't the same, and epoll() doesn't
    actually work quite correctly in some cases due to this on Sparc et al.

    The next patch from Al will sort out the final differences, and we
    should be all done.

    Scripted-by: Al Viro
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

01 Feb, 2018

1 commit

  • Pull networking updates from David Miller:

    1) Significantly shrink the core networking routing structures. Result
    of http://vger.kernel.org/~davem/seoul2017_netdev_keynote.pdf

    2) Add netdevsim driver for testing various offloads, from Jakub
    Kicinski.

    3) Support cross-chip FDB operations in DSA, from Vivien Didelot.

    4) Add a 2nd listener hash table for TCP, similar to what was done for
    UDP. From Martin KaFai Lau.

    5) Add eBPF based queue selection to tun, from Jason Wang.

    6) Lockless qdisc support, from John Fastabend.

    7) SCTP stream interleave support, from Xin Long.

    8) Smoother TCP receive autotuning, from Eric Dumazet.

    9) Lots of erspan tunneling enhancements, from William Tu.

    10) Add true function call support to BPF, from Alexei Starovoitov.

    11) Add explicit support for GRO HW offloading, from Michael Chan.

    12) Support extack generation in more netlink subsystems. From Alexander
    Aring, Quentin Monnet, and Jakub Kicinski.

    13) Add 1000BaseX, flow control, and EEE support to mvneta driver. From
    Russell King.

    14) Add flow table abstraction to netfilter, from Pablo Neira Ayuso.

    15) Many improvements and simplifications to the NFP driver bpf JIT,
    from Jakub Kicinski.

    16) Support for ipv6 non-equal cost multipath routing, from Ido
    Schimmel.

    17) Add resource abstraction to devlink, from Arkadi Sharshevsky.

    18) Packet scheduler classifier shared filter block support, from Jiri
    Pirko.

    19) Avoid locking in act_csum, from Davide Caratti.

    20) devinet_ioctl() simplifications from Al Viro.

    21) More TCP bpf improvements from Lawrence Brakmo.

    22) Add support for onlink ipv6 route flag, similar to ipv4, from David
    Ahern.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1925 commits)
    tls: Add support for encryption using async offload accelerator
    ip6mr: fix stale iterator
    net/sched: kconfig: Remove blank help texts
    openvswitch: meter: Use 64-bit arithmetic instead of 32-bit
    tcp_nv: fix potential integer overflow in tcpnv_acked
    r8169: fix RTL8168EP take too long to complete driver initialization.
    qmi_wwan: Add support for Quectel EP06
    rtnetlink: enable IFLA_IF_NETNSID for RTM_NEWLINK
    ipmr: Fix ptrdiff_t print formatting
    ibmvnic: Wait for device response when changing MAC
    qlcnic: fix deadlock bug
    tcp: release sk_frag.page in tcp_disconnect
    ipv4: Get the address of interface correctly.
    net_sched: gen_estimator: fix lockdep splat
    net: macb: Handle HRESP error
    net/mlx5e: IPoIB, Fix copy-paste bug in flow steering refactoring
    ipv6: addrconf: break critical section in addrconf_verify_rtnl()
    ipv6: change route cache aging logic
    i40e/i40evf: Update DESC_NEEDED value to reflect larger value
    bnxt_en: cleanup DIM work on device shutdown
    ...

    Linus Torvalds
     

31 Jan, 2018

1 commit

  • Pull poll annotations from Al Viro:
    "This introduces a __bitwise type for POLL### bitmap, and propagates
    the annotations through the tree. Most of that stuff is as simple as
    'make ->poll() instances return __poll_t and do the same to local
    variables used to hold the future return value'.

    Some of the obvious brainos found in process are fixed (e.g. POLLIN
    misspelled as POLL_IN). At that point the amount of sparse warnings is
    low and most of them are for genuine bugs - e.g. ->poll() instance
    deciding to return -EINVAL instead of a bitmap. I hadn't touched those
    in this series - it's large enough as it is.

    Another problem it has caught was eventpoll() ABI mess; select.c and
    eventpoll.c assumed that corresponding POLL### and EPOLL### were
    equal. That's true for some, but not all of them - EPOLL### are
    arch-independent, but POLL### are not.

    The last commit in this series separates userland POLL### values from
    the (now arch-independent) kernel-side ones, converting between them
    in the few places where they are copied to/from userland. AFAICS, this
    is the least disruptive fix preserving poll(2) ABI and making epoll()
    work on all architectures.

    As it is, it's simply broken on sparc - try to give it EPOLLWRNORM and
    it will trigger only on what would've triggered EPOLLWRBAND on other
    architectures. EPOLLWRBAND and EPOLLRDHUP, OTOH, are never triggered
    at all on sparc. With this patch they should work consistently on all
    architectures"

    * 'misc.poll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (37 commits)
    make kernel-side POLL... arch-independent
    eventpoll: no need to mask the result of epi_item_poll() again
    eventpoll: constify struct epoll_event pointers
    debugging printk in sg_poll() uses %x to print POLL... bitmap
    annotate poll(2) guts
    9p: untangle ->poll() mess
    ->si_band gets POLL... bitmap stored into a user-visible long field
    ring_buffer_poll_wait() return value used as return value of ->poll()
    the rest of drivers/*: annotate ->poll() instances
    media: annotate ->poll() instances
    fs: annotate ->poll() instances
    ipc, kernel, mm: annotate ->poll() instances
    net: annotate ->poll() instances
    apparmor: annotate ->poll() instances
    tomoyo: annotate ->poll() instances
    sound: annotate ->poll() instances
    acpi: annotate ->poll() instances
    crypto: annotate ->poll() instances
    block: annotate ->poll() instances
    x86: annotate ->poll() instances
    ...

    Linus Torvalds
     

03 Jan, 2018

1 commit


21 Dec, 2017

1 commit


06 Dec, 2017

1 commit


28 Nov, 2017

1 commit


17 Aug, 2017

1 commit

  • The syzkaller team reported another problem in DCCP [1].

    The problem here is that the structure holding the RTO timer
    (ccid2_hc_tx_rto_expire() handler) is freed too soon.

    We cannot use del_timer_sync() to cancel the timer,
    since this timer wants to grab the socket lock (that would risk a deadlock).

    The solution is to defer the freeing of memory until all references to
    the socket have been released. Socket timers do own a reference, so this
    should fix the issue.

    [1]

    ==================================================================
    BUG: KASAN: use-after-free in ccid2_hc_tx_rto_expire+0x51c/0x5c0 net/dccp/ccids/ccid2.c:144
    Read of size 4 at addr ffff8801d2660540 by task kworker/u4:7/3365

    CPU: 1 PID: 3365 Comm: kworker/u4:7 Not tainted 4.13.0-rc4+ #3
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Workqueue: events_unbound call_usermodehelper_exec_work
    Call Trace:

    __dump_stack lib/dump_stack.c:16 [inline]
    dump_stack+0x194/0x257 lib/dump_stack.c:52
    print_address_description+0x73/0x250 mm/kasan/report.c:252
    kasan_report_error mm/kasan/report.c:351 [inline]
    kasan_report+0x24e/0x340 mm/kasan/report.c:409
    __asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:429
    ccid2_hc_tx_rto_expire+0x51c/0x5c0 net/dccp/ccids/ccid2.c:144
    call_timer_fn+0x233/0x830 kernel/time/timer.c:1268
    expire_timers kernel/time/timer.c:1307 [inline]
    __run_timers+0x7fd/0xb90 kernel/time/timer.c:1601
    run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614
    __do_softirq+0x2f5/0xba3 kernel/softirq.c:284
    invoke_softirq kernel/softirq.c:364 [inline]
    irq_exit+0x1cc/0x200 kernel/softirq.c:405
    exiting_irq arch/x86/include/asm/apic.h:638 [inline]
    smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:1044
    apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:702
    RIP: 0010:arch_local_irq_enable arch/x86/include/asm/paravirt.h:824 [inline]
    RIP: 0010:__raw_write_unlock_irq include/linux/rwlock_api_smp.h:267 [inline]
    RIP: 0010:_raw_write_unlock_irq+0x56/0x70 kernel/locking/spinlock.c:343
    RSP: 0018:ffff8801cd50eaa8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff10
    RAX: dffffc0000000000 RBX: ffffffff85a090c0 RCX: 0000000000000006
    RDX: 1ffffffff0b595f3 RSI: 1ffff1003962f989 RDI: ffffffff85acaf98
    RBP: ffff8801cd50eab0 R08: 0000000000000001 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801cc96ea60
    R13: dffffc0000000000 R14: ffff8801cc96e4c0 R15: ffff8801cc96e4c0

    release_task+0xe9e/0x1a40 kernel/exit.c:220
    wait_task_zombie kernel/exit.c:1162 [inline]
    wait_consider_task+0x29b8/0x33c0 kernel/exit.c:1389
    do_wait_thread kernel/exit.c:1452 [inline]
    do_wait+0x441/0xa90 kernel/exit.c:1523
    kernel_wait4+0x1f5/0x370 kernel/exit.c:1665
    SYSC_wait4+0x134/0x140 kernel/exit.c:1677
    SyS_wait4+0x2c/0x40 kernel/exit.c:1673
    call_usermodehelper_exec_sync kernel/kmod.c:286 [inline]
    call_usermodehelper_exec_work+0x1a0/0x2c0 kernel/kmod.c:323
    process_one_work+0xbf3/0x1bc0 kernel/workqueue.c:2097
    worker_thread+0x223/0x1860 kernel/workqueue.c:2231
    kthread+0x35e/0x430 kernel/kthread.c:231
    ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:425

    Allocated by task 21267:
    save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
    save_stack+0x43/0xd0 mm/kasan/kasan.c:447
    set_track mm/kasan/kasan.c:459 [inline]
    kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
    kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:489
    kmem_cache_alloc+0x127/0x750 mm/slab.c:3561
    ccid_new+0x20e/0x390 net/dccp/ccid.c:151
    dccp_hdlr_ccid+0x27/0x140 net/dccp/feat.c:44
    __dccp_feat_activate+0x142/0x2a0 net/dccp/feat.c:344
    dccp_feat_activate_values+0x34e/0xa90 net/dccp/feat.c:1538
    dccp_rcv_request_sent_state_process net/dccp/input.c:472 [inline]
    dccp_rcv_state_process+0xed1/0x1620 net/dccp/input.c:677
    dccp_v4_do_rcv+0xeb/0x160 net/dccp/ipv4.c:679
    sk_backlog_rcv include/net/sock.h:911 [inline]
    __release_sock+0x124/0x360 net/core/sock.c:2269
    release_sock+0xa4/0x2a0 net/core/sock.c:2784
    inet_wait_for_connect net/ipv4/af_inet.c:557 [inline]
    __inet_stream_connect+0x671/0xf00 net/ipv4/af_inet.c:643
    inet_stream_connect+0x58/0xa0 net/ipv4/af_inet.c:682
    SYSC_connect+0x204/0x470 net/socket.c:1642
    SyS_connect+0x24/0x30 net/socket.c:1623
    entry_SYSCALL_64_fastpath+0x1f/0xbe

    Freed by task 3049:
    save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
    save_stack+0x43/0xd0 mm/kasan/kasan.c:447
    set_track mm/kasan/kasan.c:459 [inline]
    kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
    __cache_free mm/slab.c:3503 [inline]
    kmem_cache_free+0x77/0x280 mm/slab.c:3763
    ccid_hc_tx_delete+0xc5/0x100 net/dccp/ccid.c:190
    dccp_destroy_sock+0x1d1/0x2b0 net/dccp/proto.c:225
    inet_csk_destroy_sock+0x166/0x3f0 net/ipv4/inet_connection_sock.c:833
    dccp_done+0xb7/0xd0 net/dccp/proto.c:145
    dccp_time_wait+0x13d/0x300 net/dccp/minisocks.c:72
    dccp_rcv_reset+0x1d1/0x5b0 net/dccp/input.c:160
    dccp_rcv_state_process+0x8fc/0x1620 net/dccp/input.c:663
    dccp_v4_do_rcv+0xeb/0x160 net/dccp/ipv4.c:679
    sk_backlog_rcv include/net/sock.h:911 [inline]
    __sk_receive_skb+0x33e/0xc00 net/core/sock.c:521
    dccp_v4_rcv+0xef1/0x1c00 net/dccp/ipv4.c:871
    ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216
    NF_HOOK include/linux/netfilter.h:248 [inline]
    ip_local_deliver+0x1ce/0x6d0 net/ipv4/ip_input.c:257
    dst_input include/net/dst.h:477 [inline]
    ip_rcv_finish+0x8db/0x19c0 net/ipv4/ip_input.c:397
    NF_HOOK include/linux/netfilter.h:248 [inline]
    ip_rcv+0xc3f/0x17d0 net/ipv4/ip_input.c:488
    __netif_receive_skb_core+0x19af/0x33d0 net/core/dev.c:4417
    __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4455
    process_backlog+0x203/0x740 net/core/dev.c:5130
    napi_poll net/core/dev.c:5527 [inline]
    net_rx_action+0x792/0x1910 net/core/dev.c:5593
    __do_softirq+0x2f5/0xba3 kernel/softirq.c:284

    The buggy address belongs to the object at ffff8801d2660100
    which belongs to the cache ccid2_hc_tx_sock of size 1240
    The buggy address is located 1088 bytes inside of
    1240-byte region [ffff8801d2660100, ffff8801d26605d8)
    The buggy address belongs to the page:
    page:ffffea0007499800 count:1 mapcount:0 mapping:ffff8801d2660100 index:0x0 compound_mapcount: 0
    flags: 0x200000000008100(slab|head)
    raw: 0200000000008100 ffff8801d2660100 0000000000000000 0000000100000005
    raw: ffffea00075271a0 ffffea0007538820 ffff8801d3aef9c0 0000000000000000
    page dumped because: kasan: bad access detected

    Memory state around the buggy address:
    ffff8801d2660400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    ffff8801d2660480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    >ffff8801d2660500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    ^
    ffff8801d2660580: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
    ffff8801d2660600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    ==================================================================

    Signed-off-by: Eric Dumazet
    Reported-by: Dmitry Vyukov
    Cc: Gerrit Renker
    Signed-off-by: David S. Miller

    Eric Dumazet
     

15 Aug, 2017

1 commit

  • syzkaller reported that DCCP could have a non-empty
    write queue at dismantle time.

    WARNING: CPU: 1 PID: 2953 at net/core/stream.c:199 sk_stream_kill_queues+0x3ce/0x520 net/core/stream.c:199
    Kernel panic - not syncing: panic_on_warn set ...

    CPU: 1 PID: 2953 Comm: syz-executor0 Not tainted 4.13.0-rc4+ #2
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:16 [inline]
    dump_stack+0x194/0x257 lib/dump_stack.c:52
    panic+0x1e4/0x417 kernel/panic.c:180
    __warn+0x1c4/0x1d9 kernel/panic.c:541
    report_bug+0x211/0x2d0 lib/bug.c:183
    fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:190
    do_trap_no_signal arch/x86/kernel/traps.c:224 [inline]
    do_trap+0x260/0x390 arch/x86/kernel/traps.c:273
    do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:310
    do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:323
    invalid_op+0x1e/0x30 arch/x86/entry/entry_64.S:846
    RIP: 0010:sk_stream_kill_queues+0x3ce/0x520 net/core/stream.c:199
    RSP: 0018:ffff8801d182f108 EFLAGS: 00010297
    RAX: ffff8801d1144140 RBX: ffff8801d13cb280 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: ffffffff85137b00 RDI: ffff8801d13cb280
    RBP: ffff8801d182f148 R08: 0000000000000001 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801d13cb4d0
    R13: ffff8801d13cb3b8 R14: ffff8801d13cb300 R15: ffff8801d13cb3b8
    inet_csk_destroy_sock+0x175/0x3f0 net/ipv4/inet_connection_sock.c:835
    dccp_close+0x84d/0xc10 net/dccp/proto.c:1067
    inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425
    sock_release+0x8d/0x1e0 net/socket.c:597
    sock_close+0x16/0x20 net/socket.c:1126
    __fput+0x327/0x7e0 fs/file_table.c:210
    ____fput+0x15/0x20 fs/file_table.c:246
    task_work_run+0x18a/0x260 kernel/task_work.c:116
    exit_task_work include/linux/task_work.h:21 [inline]
    do_exit+0xa32/0x1b10 kernel/exit.c:865
    do_group_exit+0x149/0x400 kernel/exit.c:969
    get_signal+0x7e8/0x17e0 kernel/signal.c:2330
    do_signal+0x94/0x1ee0 arch/x86/kernel/signal.c:808
    exit_to_usermode_loop+0x21c/0x2d0 arch/x86/entry/common.c:157
    prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
    syscall_return_slowpath+0x3a7/0x450 arch/x86/entry/common.c:263

    Signed-off-by: Eric Dumazet
    Reported-by: Dmitry Vyukov
    Signed-off-by: David S. Miller

    Eric Dumazet
     

04 Nov, 2016

1 commit

  • Andrey reported the following warning while fuzzing with syzkaller:

    WARNING: CPU: 1 PID: 21072 at net/dccp/proto.c:83 dccp_set_state+0x229/0x290
    Kernel panic - not syncing: panic_on_warn set ...

    CPU: 1 PID: 21072 Comm: syz-executor Not tainted 4.9.0-rc1+ #293
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    ffff88003d4c7738 ffffffff81b474f4 0000000000000003 dffffc0000000000
    ffffffff844f8b00 ffff88003d4c7804 ffff88003d4c7800 ffffffff8140c06a
    0000000041b58ab3 ffffffff8479ab7d ffffffff8140beae ffffffff8140cd00
    Call Trace:
    [< inline >] __dump_stack lib/dump_stack.c:15
    [] dump_stack+0xb3/0x10f lib/dump_stack.c:51
    [] panic+0x1bc/0x39d kernel/panic.c:179
    [] __warn+0x1cc/0x1f0 kernel/panic.c:542
    [] warn_slowpath_null+0x2c/0x40 kernel/panic.c:585
    [] dccp_set_state+0x229/0x290 net/dccp/proto.c:83
    [] dccp_close+0x612/0xc10 net/dccp/proto.c:1016
    [] inet_release+0xef/0x1c0 net/ipv4/af_inet.c:415
    [] sock_release+0x8e/0x1d0 net/socket.c:570
    [] sock_close+0x16/0x20 net/socket.c:1017
    [] __fput+0x29d/0x720 fs/file_table.c:208
    [] ____fput+0x15/0x20 fs/file_table.c:244
    [] task_work_run+0xf8/0x170 kernel/task_work.c:116
    [< inline >] exit_task_work include/linux/task_work.h:21
    [] do_exit+0x883/0x2ac0 kernel/exit.c:828
    [] do_group_exit+0x10e/0x340 kernel/exit.c:931
    [] get_signal+0x634/0x15a0 kernel/signal.c:2307
    [] do_signal+0x8d/0x1a30 arch/x86/kernel/signal.c:807
    [] exit_to_usermode_loop+0xe5/0x130 arch/x86/entry/common.c:156
    [< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:190
    [] syscall_return_slowpath+0x1a8/0x1e0 arch/x86/entry/common.c:259
    [] entry_SYSCALL_64_fastpath+0xc0/0xc2
    Dumping ftrace buffer:
    (ftrace buffer empty)
    Kernel Offset: disabled

    Fix this the same way we did for TCP in commit 565b7b2d2e63
    ("tcp: do not send reset to already closed sockets")
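
    The idea behind the fix can be modelled in userspace. This is only a
    sketch under stated assumptions, not the kernel code: model_sock,
    model_set_state() and model_close() are hypothetical names, and the
    model assumes the warning fires when a socket is moved to the state it
    is already in. With the early exit enabled, closing a socket that was
    already reset sends no reset and triggers no warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum model_state { MODEL_OPEN, MODEL_CLOSED };

struct model_sock {
	enum model_state state;
	bool data_was_unread;	/* unread data forces a reset on close */
	int resets_sent;
	int warnings;		/* stands in for the kernel WARNING */
};

/* Assumption: a transition to the current state counts as a warning. */
static void model_set_state(struct model_sock *sk, enum model_state state)
{
	assert(sk != NULL);
	if (sk->state == state)
		sk->warnings++;
	sk->state = state;
}

/*
 * skip_if_closed == false models the behaviour before the fix,
 * skip_if_closed == true models the early exit for closed sockets.
 */
static void model_close(struct model_sock *sk, bool skip_if_closed)
{
	if (skip_if_closed && sk->state == MODEL_CLOSED)
		return;	/* already reset: no reset sent, no state change */

	if (sk->data_was_unread) {
		sk->resets_sent++;
		model_set_state(sk, MODEL_CLOSED);
	}
}
```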

    Signed-off-by: Eric Dumazet
    Reported-by: Andrey Konovalov
    Tested-by: Andrey Konovalov
    Signed-off-by: David S. Miller

    Eric Dumazet
     

02 Dec, 2015

1 commit

  • This patch is a cleanup to make the following patch easier to
    review.

    The goal is to move SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA
    from (struct socket)->flags to (struct socket_wq)->flags
    to benefit from RCU protection in sock_wake_async().

    To ease backports, we rename both constants.

    Two new helpers, sk_set_bit(int nr, struct sock *sk)
    and sk_clear_bit(int nr, struct sock *sk), are added so that the
    following patch can change their implementation.
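
    A rough userspace sketch of why such helpers are useful: callers name
    only the bit and the sock, so the word that stores the flags can move
    later by editing the helpers alone. Every name below (model_socket,
    model_sock, MODEL_ASYNC_*, model_sk_*_bit) is a hypothetical stand-in,
    and the bit operations are plain, non-atomic C rather than the
    kernel's atomic bitops.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-ins; these are not the kernel's definitions. */
enum { MODEL_ASYNC_NOSPACE = 0, MODEL_ASYNC_WAITDATA = 1 };

struct model_socket { unsigned long flags; };
struct model_sock { struct model_socket *sk_socket; };

#define MODEL_NBITS ((int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Callers pass only the bit number and the sock. Where the flags word
 * lives is a detail of these helpers, so it can be relocated later
 * without touching any call site.
 */
static void model_sk_set_bit(int nr, struct model_sock *sk)
{
	assert(nr >= 0 && nr < MODEL_NBITS);
	sk->sk_socket->flags |= 1UL << nr;	/* non-atomic in this model */
}

static void model_sk_clear_bit(int nr, struct model_sock *sk)
{
	assert(nr >= 0 && nr < MODEL_NBITS);
	sk->sk_socket->flags &= ~(1UL << nr);	/* non-atomic in this model */
}

static int model_sk_test_bit(int nr, const struct model_sock *sk)
{
	assert(nr >= 0 && nr < MODEL_NBITS);
	return (int)((sk->sk_socket->flags >> nr) & 1UL);
}
```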

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

27 Jul, 2015

1 commit

  • Currently, tcp_recvmsg enters a busy loop in sk_wait_data if called
    with flags = MSG_WAITALL | MSG_PEEK.

    sk_wait_data waits until sk_receive_queue is non-empty. In this case
    the receive queue is non-empty, but it does not contain any skb that
    we can use.

    Add a "last skb seen on receive queue" argument to sk_wait_data, so
    that it sleeps until the receive queue has new skbs.
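
    The difference between the two wake-up conditions can be shown with a
    small userspace model. This is a sketch, not the kernel
    implementation: model_skb, model_queue and both condition functions
    are hypothetical names. The old condition is true whenever anything is
    queued, so a peeking reader never sleeps; the new condition is true
    only once the queue tail differs from the last skb the reader saw.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical singly linked receive queue, standing in for sk_receive_queue. */
struct model_skb { struct model_skb *next; };
struct model_queue { struct model_skb *head, *tail; };

static void model_enqueue(struct model_queue *q, struct model_skb *skb)
{
	assert(q != NULL && skb != NULL);
	skb->next = NULL;
	if (q->tail)
		q->tail->next = skb;
	else
		q->head = skb;
	q->tail = skb;
}

/* Old condition: wake as soon as the queue is non-empty. */
static bool model_wake_if_nonempty(const struct model_queue *q)
{
	return q->head != NULL;
}

/*
 * New condition: wake only when the tail is not the skb the caller saw
 * last. Pass NULL as last_seen when the caller has seen nothing yet.
 */
static bool model_wake_if_new_skb(const struct model_queue *q,
				  const struct model_skb *last_seen)
{
	return q->tail != last_seen;
}
```

    With one already-peeked skb on the queue, model_wake_if_nonempty()
    keeps returning true (the busy loop), while model_wake_if_new_skb()
    returns false until another skb is enqueued.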

    Link: https://bugzilla.kernel.org/show_bug.cgi?id=99461
    Link: https://sourceware.org/bugzilla/show_bug.cgi?id=18493
    Link: https://bugzilla.redhat.com/show_bug.cgi?id=1205258
    Reported-by: Enrico Scholz
    Reported-by: Dan Searle
    Signed-off-by: Sabrina Dubroca
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Sabrina Dubroca
     

03 Mar, 2015

1 commit

  • Now that TIPC no longer depends on the iocb argument in its internal
    implementations of the sendmsg() and recvmsg() hooks defined in the
    proto structure, no user of the iocb argument remains.
    We can therefore drop the redundant iocb argument completely from all
    implementations of both sendmsg() and recvmsg() in the entire
    networking stack.

    Cc: Christoph Hellwig
    Suggested-by: Al Viro
    Signed-off-by: Ying Xue
    Signed-off-by: David S. Miller

    Ying Xue
     

11 Dec, 2014

1 commit


24 Nov, 2014

1 commit


06 Nov, 2014

1 commit

  • This encapsulates all of the skb_copy_datagram_iovec() callers
    with call argument signature "skb, offset, msghdr->msg_iov, length".

    When we move to iov_iters in networking, the iov_iter object will
    sit in the msghdr.

    Having a helper like this means there will be fewer places to touch
    during that transformation.
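
    The shape of such a helper can be sketched in userspace. The types and
    functions below (model_skb, model_msghdr, model_copy_datagram_iovec,
    model_copy_datagram_msg) are hypothetical, and unlike the call
    signature quoted above the model passes an explicit iovec count.
    Callers hand over the whole message header, so only the wrapper has to
    know which field of the header holds the destination buffers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical userspace stand-ins for the packet and message header. */
struct model_skb { const unsigned char *data; size_t len; };
struct model_msghdr { struct iovec *msg_iov; size_t msg_iovlen; };

/* Copy len bytes of the packet, starting at offset, across the iovec array. */
static int model_copy_datagram_iovec(const struct model_skb *skb, size_t offset,
				     struct iovec *iov, size_t iovcnt, size_t len)
{
	const unsigned char *src;
	size_t i;

	if (offset > skb->len || len > skb->len - offset)
		return -EFAULT;

	src = skb->data + offset;
	for (i = 0; i < iovcnt && len > 0; i++) {
		size_t n = iov[i].iov_len < len ? iov[i].iov_len : len;

		memcpy(iov[i].iov_base, src, n);
		src += n;
		len -= n;
	}
	return len == 0 ? 0 : -EFAULT;	/* fail if the buffers were too small */
}

/*
 * The wrapper: callers pass the message header itself. If the header
 * later carries a different buffer description, only this function and
 * the copy routine need to change.
 */
static int model_copy_datagram_msg(const struct model_skb *skb, size_t offset,
				   struct model_msghdr *msg, size_t len)
{
	assert(skb != NULL && msg != NULL);
	return model_copy_datagram_iovec(skb, offset, msg->msg_iov,
					 msg->msg_iovlen, len);
}
```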

    Based upon descriptions and patch from Al Viro.

    Signed-off-by: David S. Miller

    David S. Miller