26 Jul, 2020

1 commit

  • The UDP reuseport conflict was a little bit tricky.

    The net-next code, via bpf-next, extracted the reuseport handling
    into a helper so that the BPF sk lookup code could invoke it.

    At the same time, the logic for reuseport handling of unconnected
    sockets changed via commit efc6b6f6c3113e8b203b9debfb72d81e0f3dcace
    which changed the logic to carry on the reuseport result into the
    rest of the lookup loop if we do not return immediately.

    This requires moving the reuseport_has_conns() logic into the callers.

    While we are here, get rid of inline directives as they do not belong
    in foo.c files.

    The other changes were cases of more straightforward overlapping
    modifications.

    Signed-off-by: David S. Miller

    David S. Miller
     

25 Jul, 2020

1 commit

  • Rework the remaining setsockopt code to pass a sockptr_t instead of a
    plain user pointer. This removes the last remaining set_fs(KERNEL_DS)
    outside of architecture specific code.

    Signed-off-by: Christoph Hellwig
    Acked-by: Stefan Schmidt [ieee802154]
    Acked-by: Matthieu Baerts
    Signed-off-by: David S. Miller

    Christoph Hellwig
     

24 Jul, 2020

1 commit

  • We recently added some bounds checking in ax25_connect() and
    ax25_sendmsg(), and so we removed the AX25_MAX_DIGIS checks because
    they were no longer required.

    Unfortunately, I believe they are required to prevent integer overflows
    so I have added them back.

    Fixes: 8885bb0621f0 ("AX.25: Prevent out-of-bounds read in ax25_sendmsg()")
    Fixes: 2f2a7ffad5c6 ("AX.25: Fix out-of-bounds read in ax25_connect()")
    Signed-off-by: Dan Carpenter
    Signed-off-by: David S. Miller

    Dan Carpenter
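
A minimal userspace sketch of the overflow concern described in the entry above. The sizes (16-byte base address, 7-byte digipeater entries, at most 8 digipeaters) and both helper names are assumptions made for illustration; this is not the kernel code. With 64-bit unsigned arithmetic, a negative digipeater count converts to a huge value, the multiplication wraps, and a length-only check accepts the address.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes, chosen to mirror the AX.25 sockaddr layout (illustrative). */
#define BASE_LEN   16u  /* stand-in for sizeof(struct sockaddr_ax25) */
#define DIGI_LEN    7u  /* stand-in for sizeof(ax25_address) */
#define MAX_DIGIS   8   /* stand-in for AX25_MAX_DIGIS */

static_assert(BASE_LEN + DIGI_LEN * MAX_DIGIS == 72u,
              "full address with 8 digipeaters is 72 bytes in this sketch");

/* Length-only check: the arithmetic is unsigned and 64 bits wide, so a
 * negative ndigis wraps around to a small "required" length. */
static int len_check_only(uint64_t addr_len, int ndigis)
{
    uint64_t need = BASE_LEN + DIGI_LEN * (uint64_t)(int64_t)ndigis;

    return addr_len >= need;
}

/* Range check first, then the length check: ndigis is known to be in
 * 1..MAX_DIGIS before it is multiplied, so nothing can wrap. */
static int len_check_with_bound(uint64_t addr_len, int ndigis)
{
    if (ndigis < 1 || ndigis > MAX_DIGIS)
        return 0;
    return len_check_only(addr_len, ndigis);
}
```

For example, `len_check_only(72, -1)` accepts the input because the required length wraps to 9, while `len_check_with_bound(72, -1)` rejects it.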
     

23 Jul, 2020

2 commits

  • Checks on `addr_len` and `usax->sax25_ndigis` are insufficient.
    ax25_sendmsg() can go out of bounds when `usax->sax25_ndigis` equals 7
    or 8. Fix it.

    It is safe to remove `usax->sax25_ndigis > AX25_MAX_DIGIS`, since
    `addr_len` is guaranteed to be less than or equal to
    `sizeof(struct full_sockaddr_ax25)`.

    Signed-off-by: Peilin Ye
    Signed-off-by: David S. Miller

    Peilin Ye
     
  • Checks on `addr_len` and `fsa->fsa_ax25.sax25_ndigis` are insufficient.
    ax25_connect() can go out of bounds when `fsa->fsa_ax25.sax25_ndigis`
    equals 7 or 8. Fix it.

    This issue has been reported as a KMSAN uninit-value bug, because in such
    a case, ax25_connect() reaches into the uninitialized portion of the
    `struct sockaddr_storage` statically allocated in __sys_connect().

    It is safe to remove `fsa->fsa_ax25.sax25_ndigis > AX25_MAX_DIGIS` because
    `addr_len` is guaranteed to be less than or equal to
    `sizeof(struct full_sockaddr_ax25)`.

    Reported-by: syzbot+c82752228ed975b0a623@syzkaller.appspotmail.com
    Link: https://syzkaller.appspot.com/bug?id=55ef9d629f3b3d7d70b69558015b63b48d01af66
    Signed-off-by: Peilin Ye
    Signed-off-by: David S. Miller

    Peilin Ye
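
The out-of-bounds condition in the two entries above comes down to simple arithmetic: a claimed digipeater count is only safe if the caller actually supplied that many entries. The sketch below assumes a 16-byte base address and 7-byte digipeater entries; the constants and function names are hypothetical and only illustrate the relationship between `addr_len` and the digipeater count.

```c
#include <assert.h>
#include <stddef.h>

#define BASE_LEN  16u   /* assumed sizeof(struct sockaddr_ax25) */
#define DIGI_LEN   7u   /* assumed sizeof(ax25_address) */
#define MAX_DIGIS  8u   /* assumed AX25_MAX_DIGIS */

static_assert(BASE_LEN + 7u * DIGI_LEN == 65u,
              "seven digipeaters need 65 bytes in this sketch");

/* Number of complete digipeater entries that addr_len bytes can hold. */
static unsigned int digis_covered(size_t addr_len)
{
    size_t n;

    if (addr_len <= BASE_LEN)
        return 0;
    n = (addr_len - BASE_LEN) / DIGI_LEN;
    return n > MAX_DIGIS ? MAX_DIGIS : (unsigned int)n;
}

/* A claimed count is in bounds only if every claimed entry was supplied. */
static int ndigis_in_bounds(size_t addr_len, unsigned int ndigis)
{
    return ndigis <= digis_covered(addr_len);
}
```

Under these assumptions, a count of 7 needs at least 65 bytes and a count of 8 needs all 72, so a shorter address claiming 7 or 8 digipeaters would be read past its end.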
     

14 Jul, 2020

1 commit

  • Rationale:
    Reduces attack surface on kernel devs opening the links for MITM
    as HTTPS traffic is much harder to manipulate.

    Deterministic algorithm:
    For each file:
      If not .svg:
        For each line:
          If doesn't contain `\bxmlns\b`:
            For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
              If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
                If both the HTTP and HTTPS versions
                return 200 OK and serve the same content:
                  Replace HTTP with HTTPS.

    Signed-off-by: Alexander A. Klimov
    Signed-off-by: David S. Miller

    Alexander A. Klimov
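
The textual filters of the algorithm above can be approximated in a few lines. This is a simplified sketch, not the script used for the commit: plain substring matching replaces the `\b` word-boundary regexes, the .svg file filter is left to the caller, and the check that both versions return 200 OK with the same content is not performed. The function name is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 when `link`, found on `line`, passes the textual filters.
 * Substring matching is an assumption of this sketch; the original
 * description uses regular expressions with word boundaries. */
static int link_is_candidate(const char *line, const char *link)
{
    assert(line != NULL && link != NULL);

    if (strstr(line, "xmlns") != NULL)
        return 0;                       /* skip XML namespace lines */
    if (strncmp(link, "http://", 7) != 0)
        return 0;                       /* only plain-HTTP links qualify */
    if (strstr(link, "gnu.org/license") != NULL ||
        strstr(link, "mozilla.org/MPL") != NULL)
        return 0;                       /* excluded license URLs */
    return 1;
}
```

A link that passes this filter would still need the network comparison before being rewritten to HTTPS.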
     

25 May, 2020

1 commit


21 May, 2020

1 commit

  • syzbot was able to trigger this trace [1], probably by using
    a zero optlen.

    While we are at it, cap optlen to IFNAMSIZ - 1 instead of IFNAMSIZ.

    [1]
    BUG: KMSAN: uninit-value in strnlen+0xf9/0x170 lib/string.c:569
    CPU: 0 PID: 8807 Comm: syz-executor483 Not tainted 5.7.0-rc4-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x1c9/0x220 lib/dump_stack.c:118
    kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:121
    __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215
    strnlen+0xf9/0x170 lib/string.c:569
    dev_name_hash net/core/dev.c:207 [inline]
    netdev_name_node_lookup net/core/dev.c:277 [inline]
    __dev_get_by_name+0x75/0x2b0 net/core/dev.c:778
    ax25_setsockopt+0xfa3/0x1170 net/ax25/af_ax25.c:654
    __compat_sys_setsockopt+0x4ed/0x910 net/compat.c:403
    __do_compat_sys_setsockopt net/compat.c:413 [inline]
    __se_compat_sys_setsockopt+0xdd/0x100 net/compat.c:410
    __ia32_compat_sys_setsockopt+0x62/0x80 net/compat.c:410
    do_syscall_32_irqs_on arch/x86/entry/common.c:339 [inline]
    do_fast_syscall_32+0x3bf/0x6d0 arch/x86/entry/common.c:398
    entry_SYSENTER_compat+0x68/0x77 arch/x86/entry/entry_64_compat.S:139
    RIP: 0023:0xf7f57dd9
    Code: 90 e8 0b 00 00 00 f3 90 0f ae e8 eb f9 8d 74 26 00 89 3c 24 c3 90 90 90 90 90 90 90 90 90 90 90 90 51 52 55 89 e5 0f 34 cd 80 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90
    RSP: 002b:00000000ffae8c1c EFLAGS: 00000217 ORIG_RAX: 000000000000016e
    RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000000101
    RDX: 0000000000000019 RSI: 0000000020000000 RDI: 0000000000000004
    RBP: 0000000000000012 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
    R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000

    Local variable ----devname@ax25_setsockopt created at:
    ax25_setsockopt+0xe6/0x1170 net/ax25/af_ax25.c:536
    ax25_setsockopt+0xe6/0x1170 net/ax25/af_ax25.c:536

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Signed-off-by: David S. Miller

    Eric Dumazet
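
The idea behind the fix above, sketched in userspace C: cap the copied length below the buffer size and zero the buffer first, so that a later string scan never reads uninitialized bytes, even for a zero length. `NAME_BUF` stands in for IFNAMSIZ and `copy_devname()` is a hypothetical helper, not the kernel function.

```c
#include <assert.h>
#include <string.h>

#define NAME_BUF 16   /* stand-in for IFNAMSIZ (assumed value) */

/* Copies at most NAME_BUF - 1 bytes and always leaves dst NUL-terminated.
 * Returns the number of bytes actually copied. */
static size_t copy_devname(char dst[NAME_BUF], const char *src, size_t optlen)
{
    assert(dst != NULL && (src != NULL || optlen == 0));

    if (optlen > NAME_BUF - 1)
        optlen = NAME_BUF - 1;          /* leave room for the terminator */
    memset(dst, 0, NAME_BUF);           /* no stale stack contents survive */
    if (optlen > 0)
        memcpy(dst, src, optlen);
    return optlen;
}
```

With a zero `optlen` the result is an empty string rather than whatever happened to be on the stack.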
     

29 Apr, 2020

1 commit


10 Jan, 2020

1 commit

  • SK_PROTOCOL_MAX is only used in two places, for DECNet and AX.25. The
    limits have more to do with those protocol definitions than they do
    with the data type of sk_protocol, so remove SK_PROTOCOL_MAX and use
    U8_MAX directly.

    Reviewed-by: Eric Dumazet
    Signed-off-by: Mat Martineau
    Signed-off-by: David S. Miller

    Mat Martineau
     

07 Nov, 2019

1 commit


24 Sep, 2019

1 commit


17 Jun, 2019

1 commit

  • Before a thread in process context uses bh_lock_sock(),
    we must disable bh.

    syzbot reported:

    WARNING: inconsistent lock state
    5.2.0-rc3+ #32 Not tainted

    inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
    blkid/26581 [HC0[0]:SC1[1]:HE1:SE0] takes:
    00000000e0da85ee (slock-AF_AX25){+.?.}, at: spin_lock include/linux/spinlock.h:338 [inline]
    00000000e0da85ee (slock-AF_AX25){+.?.}, at: ax25_destroy_timer+0x53/0xc0 net/ax25/af_ax25.c:275
    {SOFTIRQ-ON-W} state was registered at:
    lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:4303
    __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
    _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
    spin_lock include/linux/spinlock.h:338 [inline]
    ax25_rt_autobind+0x3ca/0x720 net/ax25/ax25_route.c:429
    ax25_connect.cold+0x30/0xa4 net/ax25/af_ax25.c:1221
    __sys_connect+0x264/0x330 net/socket.c:1834
    __do_sys_connect net/socket.c:1845 [inline]
    __se_sys_connect net/socket.c:1842 [inline]
    __x64_sys_connect+0x73/0xb0 net/socket.c:1842
    do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
    entry_SYSCALL_64_after_hwframe+0x49/0xbe
    irq event stamp: 2272
    hardirqs last enabled at (2272): [] trace_hardirqs_on_thunk+0x1a/0x1c
    hardirqs last disabled at (2271): [] trace_hardirqs_off_thunk+0x1a/0x1c
    softirqs last enabled at (1522): [] __do_softirq+0x654/0x94c kernel/softirq.c:320
    softirqs last disabled at (2267): [] invoke_softirq kernel/softirq.c:374 [inline]
    softirqs last disabled at (2267): [] irq_exit+0x180/0x1d0 kernel/softirq.c:414

    other info that might help us debug this:
    Possible unsafe locking scenario:

    CPU0
    ----
    lock(slock-AF_AX25);

    lock(slock-AF_AX25);

    *** DEADLOCK ***

    1 lock held by blkid/26581:
    #0: 0000000010fd154d ((&ax25->dtimer)){+.-.}, at: lockdep_copy_map include/linux/lockdep.h:175 [inline]
    #0: 0000000010fd154d ((&ax25->dtimer)){+.-.}, at: call_timer_fn+0xe0/0x720 kernel/time/timer.c:1312

    stack backtrace:
    CPU: 1 PID: 26581 Comm: blkid Not tainted 5.2.0-rc3+ #32
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:

    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x172/0x1f0 lib/dump_stack.c:113
    print_usage_bug.cold+0x393/0x4a2 kernel/locking/lockdep.c:2935
    valid_state kernel/locking/lockdep.c:2948 [inline]
    mark_lock_irq kernel/locking/lockdep.c:3138 [inline]
    mark_lock+0xd46/0x1370 kernel/locking/lockdep.c:3513
    mark_irqflags kernel/locking/lockdep.c:3391 [inline]
    __lock_acquire+0x159f/0x5490 kernel/locking/lockdep.c:3745
    lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:4303
    __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
    _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
    spin_lock include/linux/spinlock.h:338 [inline]
    ax25_destroy_timer+0x53/0xc0 net/ax25/af_ax25.c:275
    call_timer_fn+0x193/0x720 kernel/time/timer.c:1322
    expire_timers kernel/time/timer.c:1366 [inline]
    __run_timers kernel/time/timer.c:1685 [inline]
    __run_timers kernel/time/timer.c:1653 [inline]
    run_timer_softirq+0x66f/0x1740 kernel/time/timer.c:1698
    __do_softirq+0x25c/0x94c kernel/softirq.c:293
    invoke_softirq kernel/softirq.c:374 [inline]
    irq_exit+0x180/0x1d0 kernel/softirq.c:414
    exiting_irq arch/x86/include/asm/apic.h:536 [inline]
    smp_apic_timer_interrupt+0x13b/0x550 arch/x86/kernel/apic/apic.c:1068
    apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:806

    RIP: 0033:0x7f858d5c3232
    Code: 8b 61 08 48 8b 84 24 d8 00 00 00 4c 89 44 24 28 48 8b ac 24 d0 00 00 00 4c 8b b4 24 e8 00 00 00 48 89 7c 24 68 48 89 4c 24 78 89 44 24 58 8b 84 24 e0 00 00 00 89 84 24 84 00 00 00 8b 84 24
    RSP: 002b:00007ffcaf0cf5c0 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
    RAX: 00007f858d7d27a8 RBX: 00007f858d7d8820 RCX: 00007f858d3940d8
    RDX: 00007ffcaf0cf798 RSI: 00000000f5e616f3 RDI: 00007f858d394fee
    RBP: 0000000000000000 R08: 00007ffcaf0cf780 R09: 00007f858d7db480
    R10: 0000000000000000 R11: 0000000009691a75 R12: 0000000000000005
    R13: 00000000f5e616f3 R14: 0000000000000000 R15: 00007ffcaf0cf798

    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Signed-off-by: David S. Miller

    Eric Dumazet
     

31 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 3029 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

21 May, 2019

1 commit


22 Apr, 2019

1 commit

  • Pointers should be printed with %p or %px rather than
    cast to long type and printed with %8.8lx.
    Change %8.8lx to %p to print the pointer.

    Signed-off-by: Fuqian Huang
    Signed-off-by: David S. Miller

    Fuqian Huang
     

20 Apr, 2019

1 commit

  • The SIOCGSTAMP/SIOCGSTAMPNS ioctl commands are implemented by many
    socket protocol handlers, and all of those end up calling the same
    sock_get_timestamp()/sock_get_timestampns() helper functions, which
    results in a lot of duplicate code.

    With the introduction of 64-bit time_t on 32-bit architectures, this
    gets worse, as we then need four different ioctl commands in each
    socket protocol implementation.

    To simplify that, let's add a new .gettstamp() operation in
    struct proto_ops, and move ioctl implementation into the common
    sock_ioctl()/compat_sock_ioctl_trans() functions that these all go
    through.

    We can reuse the sock_get_timestamp() implementation, but generalize
    it so it can deal with both native and compat mode, as well as
    timeval and timespec structures.

    Acked-by: Stefan Schmidt
    Acked-by: Neil Horman
    Acked-by: Marc Kleine-Budde
    Link: https://lore.kernel.org/lkml/CAK8P3a038aDQQotzua_QtKGhq8O9n+rdiz2=WDCp82ys8eUT+A@mail.gmail.com/
    Signed-off-by: Arnd Bergmann
    Acked-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Arnd Bergmann
     

24 Jan, 2019

1 commit

  • syzbot found that ax25 routes were not properly protected
    against concurrent use [1].

    In this particular report the bug happened while
    copying ax25->digipeat.

    Fix this problem by making sure we call ax25_get_route()
    while ax25_route_lock is held, so that no modification
    could happen while using the route.

    The current two ax25_get_route() callers do not sleep,
    so this change should be fine.

    Once we do that, ax25_get_route() no longer needs to
    grab a reference on the found route.

    [1]
    ax25_connect(): syz-executor0 uses autobind, please contact jreuter@yaina.de
    BUG: KASAN: use-after-free in memcpy include/linux/string.h:352 [inline]
    BUG: KASAN: use-after-free in kmemdup+0x42/0x60 mm/util.c:113
    Read of size 66 at addr ffff888066641a80 by task syz-executor2/531

    ax25_connect(): syz-executor0 uses autobind, please contact jreuter@yaina.de
    CPU: 1 PID: 531 Comm: syz-executor2 Not tainted 5.0.0-rc2+ #10
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x1db/0x2d0 lib/dump_stack.c:113
    print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
    kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
    check_memory_region_inline mm/kasan/generic.c:185 [inline]
    check_memory_region+0x123/0x190 mm/kasan/generic.c:191
    memcpy+0x24/0x50 mm/kasan/common.c:130
    memcpy include/linux/string.h:352 [inline]
    kmemdup+0x42/0x60 mm/util.c:113
    kmemdup include/linux/string.h:425 [inline]
    ax25_rt_autobind+0x25d/0x750 net/ax25/ax25_route.c:424
    ax25_connect.cold+0x30/0xa4 net/ax25/af_ax25.c:1224
    __sys_connect+0x357/0x490 net/socket.c:1664
    __do_sys_connect net/socket.c:1675 [inline]
    __se_sys_connect net/socket.c:1672 [inline]
    __x64_sys_connect+0x73/0xb0 net/socket.c:1672
    do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290
    entry_SYSCALL_64_after_hwframe+0x49/0xbe
    RIP: 0033:0x458099
    Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
    RSP: 002b:00007f870ee22c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
    RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458099
    RDX: 0000000000000048 RSI: 0000000020000080 RDI: 0000000000000005
    RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
    ax25_connect(): syz-executor4 uses autobind, please contact jreuter@yaina.de
    R10: 0000000000000000 R11: 0000000000000246 R12: 00007f870ee236d4
    R13: 00000000004be48e R14: 00000000004ce9a8 R15: 00000000ffffffff

    Allocated by task 526:
    save_stack+0x45/0xd0 mm/kasan/common.c:73
    set_track mm/kasan/common.c:85 [inline]
    __kasan_kmalloc mm/kasan/common.c:496 [inline]
    __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:469
    kasan_kmalloc+0x9/0x10 mm/kasan/common.c:504
    ax25_connect(): syz-executor5 uses autobind, please contact jreuter@yaina.de
    kmem_cache_alloc_trace+0x151/0x760 mm/slab.c:3609
    kmalloc include/linux/slab.h:545 [inline]
    ax25_rt_add net/ax25/ax25_route.c:95 [inline]
    ax25_rt_ioctl+0x3b9/0x1270 net/ax25/ax25_route.c:233
    ax25_ioctl+0x322/0x10b0 net/ax25/af_ax25.c:1763
    sock_do_ioctl+0xe2/0x400 net/socket.c:950
    sock_ioctl+0x32f/0x6c0 net/socket.c:1074
    vfs_ioctl fs/ioctl.c:46 [inline]
    file_ioctl fs/ioctl.c:509 [inline]
    do_vfs_ioctl+0x107b/0x17d0 fs/ioctl.c:696
    ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
    __do_sys_ioctl fs/ioctl.c:720 [inline]
    __se_sys_ioctl fs/ioctl.c:718 [inline]
    __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
    do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    ax25_connect(): syz-executor5 uses autobind, please contact jreuter@yaina.de
    Freed by task 550:
    save_stack+0x45/0xd0 mm/kasan/common.c:73
    set_track mm/kasan/common.c:85 [inline]
    __kasan_slab_free+0x102/0x150 mm/kasan/common.c:458
    kasan_slab_free+0xe/0x10 mm/kasan/common.c:466
    __cache_free mm/slab.c:3487 [inline]
    kfree+0xcf/0x230 mm/slab.c:3806
    ax25_rt_add net/ax25/ax25_route.c:92 [inline]
    ax25_rt_ioctl+0x304/0x1270 net/ax25/ax25_route.c:233
    ax25_ioctl+0x322/0x10b0 net/ax25/af_ax25.c:1763
    sock_do_ioctl+0xe2/0x400 net/socket.c:950
    sock_ioctl+0x32f/0x6c0 net/socket.c:1074
    vfs_ioctl fs/ioctl.c:46 [inline]
    file_ioctl fs/ioctl.c:509 [inline]
    do_vfs_ioctl+0x107b/0x17d0 fs/ioctl.c:696
    ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
    __do_sys_ioctl fs/ioctl.c:720 [inline]
    __se_sys_ioctl fs/ioctl.c:718 [inline]
    __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
    do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    The buggy address belongs to the object at ffff888066641a80
    which belongs to the cache kmalloc-96 of size 96
    The buggy address is located 0 bytes inside of
    96-byte region [ffff888066641a80, ffff888066641ae0)
    The buggy address belongs to the page:
    page:ffffea0001999040 count:1 mapcount:0 mapping:ffff88812c3f04c0 index:0x0
    flags: 0x1fffc0000000200(slab)
    ax25_connect(): syz-executor4 uses autobind, please contact jreuter@yaina.de
    raw: 01fffc0000000200 ffffea0001817948 ffffea0002341dc8 ffff88812c3f04c0
    raw: 0000000000000000 ffff888066641000 0000000100000020 0000000000000000
    page dumped because: kasan: bad access detected

    Memory state around the buggy address:
    ffff888066641980: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
    ffff888066641a00: 00 00 00 00 00 00 00 00 02 fc fc fc fc fc fc fc
    >ffff888066641a80: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
    ^
    ffff888066641b00: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
    ffff888066641b80: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc

    Signed-off-by: Eric Dumazet
    Cc: Ralf Baechle
    Reported-by: syzbot
    Signed-off-by: David S. Miller

    Eric Dumazet
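
A userspace analogy for the locking change described above, under stated assumptions: a pthread mutex stands in for ax25_route_lock, and the list, struct, and function names are hypothetical. The point it illustrates is that when lookup and copy happen inside one critical section, the entry cannot be freed in between, so no reference count is needed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define DIGI_BYTES 66   /* size of the copied data in this sketch */

struct route {
    char dest[8];
    unsigned char digipeat[DIGI_BYTES];
    struct route *next;
};

static pthread_mutex_t route_lock = PTHREAD_MUTEX_INITIALIZER;
static struct route *route_list;

/* Adds a route whose digipeater data is filled with one byte value. */
static int route_add(const char *dest, unsigned char fill)
{
    struct route *r = calloc(1, sizeof(*r));

    assert(dest != NULL);
    if (r == NULL)
        return -1;
    strncpy(r->dest, dest, sizeof(r->dest) - 1);
    memset(r->digipeat, fill, sizeof(r->digipeat));

    pthread_mutex_lock(&route_lock);
    r->next = route_list;
    route_list = r;
    pthread_mutex_unlock(&route_lock);
    return 0;
}

/* Unlinks and frees the first route matching dest, if any. */
static void route_del(const char *dest)
{
    struct route **pp, *r;

    pthread_mutex_lock(&route_lock);
    for (pp = &route_list; (r = *pp) != NULL; pp = &r->next) {
        if (strcmp(r->dest, dest) == 0) {
            *pp = r->next;
            free(r);
            break;
        }
    }
    pthread_mutex_unlock(&route_lock);
}

/* Lookup and copy share one critical section, so a concurrent route_del()
 * cannot free the entry between "found" and "copied". */
static int route_copy_digipeat(const char *dest, unsigned char out[DIGI_BYTES])
{
    struct route *r;
    int found = 0;

    pthread_mutex_lock(&route_lock);
    for (r = route_list; r != NULL; r = r->next) {
        if (strcmp(r->dest, dest) == 0) {
            memcpy(out, r->digipeat, DIGI_BYTES);
            found = 1;
            break;
        }
    }
    pthread_mutex_unlock(&route_lock);
    return found;
}
```

This only works because the code inside the critical section does not sleep or block for long, which matches the commit's note that the two callers do not sleep.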
     

31 Dec, 2018

1 commit

  • There are multiple issues here:

    1. After freeing dev->ax25_ptr, we need to set it to NULL otherwise
    we may use a dangling pointer.

    2. There is a race between ax25_setsockopt() and device notifier as
    reported by syzbot. Close it by holding RTNL lock.

    3. We need to test if dev->ax25_ptr is NULL before using it.

    Reported-and-tested-by: syzbot+ae6bb869cbed29b29040@syzkaller.appspotmail.com
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     

25 Jul, 2018

1 commit


29 Jun, 2018

1 commit

  • The poll() changes were not well thought out, and completely
    unexplained. They also caused a huge performance regression, because
    "->poll()" was no longer a trivial file operation that just called down
    to the underlying file operations, but instead did at least two indirect
    calls.

    Indirect calls are sadly slow now with the Spectre mitigation, but the
    performance problem could at least be largely mitigated by changing the
    "->get_poll_head()" operation to just have a per-file-descriptor pointer
    to the poll head instead. That gets rid of one of the new indirections.

    But that doesn't fix the new complexity that is completely unwarranted
    for the regular case. The (undocumented) reason for the poll() changes
    was some alleged AIO poll race fixing, but we don't make the common case
    slower and more complex for some uncommon special case, so this all
    really needs way more explanations and most likely a fundamental
    redesign.

    [ This revert is a revert of about 30 different commits, not reverted
    individually because that would just be unnecessarily messy - Linus ]

    Cc: Al Viro
    Cc: Christoph Hellwig
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

05 Jun, 2018

1 commit

  • Pull aio updates from Al Viro:
    "Majority of AIO stuff this cycle. aio-fsync and aio-poll, mostly.

    The only thing I'm holding back for a day or so is Adam's aio ioprio -
    his last-minute fixup is trivial (missing stub in !CONFIG_BLOCK case),
    but let it sit in -next for decency sake..."

    * 'work.aio-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
    aio: sanitize the limit checking in io_submit(2)
    aio: fold do_io_submit() into callers
    aio: shift copyin of iocb into io_submit_one()
    aio_read_events_ring(): make a bit more readable
    aio: all callers of aio_{read,write,fsync,poll} treat 0 and -EIOCBQUEUED the same way
    aio: take list removal to (some) callers of aio_complete()
    aio: add missing break for the IOCB_CMD_FDSYNC case
    random: convert to ->poll_mask
    timerfd: convert to ->poll_mask
    eventfd: switch to ->poll_mask
    pipe: convert to ->poll_mask
    crypto: af_alg: convert to ->poll_mask
    net/rxrpc: convert to ->poll_mask
    net/iucv: convert to ->poll_mask
    net/phonet: convert to ->poll_mask
    net/nfc: convert to ->poll_mask
    net/caif: convert to ->poll_mask
    net/bluetooth: convert to ->poll_mask
    net/sctp: convert to ->poll_mask
    net/tipc: convert to ->poll_mask
    ...

    Linus Torvalds
     

26 May, 2018

1 commit


16 May, 2018

1 commit


27 Mar, 2018

1 commit

  • Prefer the direct use of octal for permissions.

    Done with checkpatch -f --types=SYMBOLIC_PERMS --fix-inplace
    and some typing.

    Miscellanea:

    o Whitespace neatening around these conversions.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     

13 Feb, 2018

1 commit

  • Changes since v1:
    Added changes in these files:
    drivers/infiniband/hw/usnic/usnic_transport.c
    drivers/staging/lustre/lnet/lnet/lib-socket.c
    drivers/target/iscsi/iscsi_target_login.c
    drivers/vhost/net.c
    fs/dlm/lowcomms.c
    fs/ocfs2/cluster/tcp.c
    security/tomoyo/network.c

    Before:
    All these functions either return a negative error indicator,
    or store length of sockaddr into "int *socklen" parameter
    and return zero on success.

    "int *socklen" parameter is awkward. For example, if caller does not
    care, it still needs to provide on-stack storage for the value
    it does not need.

    None of the many FOO_getname() functions of various protocols
    ever used old value of *socklen. They always just overwrite it.

    This change drops this parameter, and makes all these functions, on success,
    return length of sockaddr. It's always >= 0 and can be differentiated
    from an error.

    Tests in callers are changed from "if (err)" to "if (err < 0)", where needed.

    rpc_sockname() lost "int buflen" parameter, since its only use was
    to be passed to kernel_getsockname() as &buflen and subsequently
    not used in any way.

    Userspace API is not changed.

        text    data     bss      dec     hex filename
    30108430 2633624  873672 33615726 200ef6e vmlinux.before.o
    30108109 2633612  873672 33615393 200ee21 vmlinux.o

    Signed-off-by: Denys Vlasenko
    CC: David S. Miller
    CC: linux-kernel@vger.kernel.org
    CC: netdev@vger.kernel.org
    CC: linux-bluetooth@vger.kernel.org
    CC: linux-decnet-user@lists.sourceforge.net
    CC: linux-wireless@vger.kernel.org
    CC: linux-rdma@vger.kernel.org
    CC: linux-sctp@vger.kernel.org
    CC: linux-nfs@vger.kernel.org
    CC: linux-x25@vger.kernel.org
    Signed-off-by: David S. Miller

    Denys Vlasenko
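
The interface change described above, reduced to a toy example. The struct and both functions are hypothetical; they only contrast the old calling convention (status returned, length stored through a pointer) with the new one (length returned, negative value on error).

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

struct demo_addr {
    unsigned short family;
    char call[7];
};

/* Old style: 0 or a negative error is returned, and the length goes
 * through an out-parameter that every caller must provide. */
static int demo_getname_old(int connected, struct demo_addr *out, int *socklen)
{
    assert(out != NULL && socklen != NULL);
    if (!connected)
        return -ENOTCONN;
    memset(out, 0, sizeof(*out));
    *socklen = (int)sizeof(*out);
    return 0;
}

/* New style: a negative error on failure, the address length on success. */
static int demo_getname_new(int connected, struct demo_addr *out)
{
    assert(out != NULL);
    if (!connected)
        return -ENOTCONN;
    memset(out, 0, sizeof(*out));
    return (int)sizeof(*out);
}
```

Callers of the new form test `if (err < 0)` instead of `if (err)`, since a successful call now returns a positive length.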
     

17 Jan, 2018

1 commit

  • /proc has been ignoring struct file_operations::owner field for 10 years.
    Specifically, it started with commit 786d7e1612f0b0adb6046f19b906609e4fe8b1ba
    ("Fix rmmod/read/write races in /proc entries"). Notice the chunk where
    inode->i_fop is initialized with proxy struct file_operations for
    regular files:

    -       if (de->proc_fops)
    -               inode->i_fop = de->proc_fops;
    +       if (de->proc_fops) {
    +               if (S_ISREG(inode->i_mode))
    +                       inode->i_fop = &proc_reg_file_ops;
    +               else
    +                       inode->i_fop = de->proc_fops;
    +       }

    VFS stopped pinning module at this point.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     

04 Nov, 2017

1 commit


02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

25 Oct, 2017

1 commit

  • In preparation for unconditionally passing the struct timer_list pointer to
    all timer callbacks, switch to using the new timer_setup() and from_timer()
    to pass the timer pointer explicitly.

    Cc: Joerg Reuter
    Cc: Ralf Baechle
    Cc: "David S. Miller"
    Cc: linux-hams@vger.kernel.org
    Cc: netdev@vger.kernel.org
    Signed-off-by: Kees Cook
    Signed-off-by: David S. Miller

    Kees Cook
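
A userspace sketch of the pattern the entry above prepares for: the callback receives a pointer to the timer itself and recovers the enclosing object from it. All `demo_` names are hypothetical stand-ins, and the macro has a simpler signature than the kernel helper; it only shows the pointer arithmetic involved.

```c
#include <assert.h>
#include <stddef.h>

struct demo_timer {
    void (*function)(struct demo_timer *t);
};

struct demo_conn {
    int expirations;
    struct demo_timer timer;    /* embedded in its owner, not pointed to */
};

/* Step back from a member pointer to the structure that contains it. */
#define demo_from_timer(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static void demo_timer_setup(struct demo_timer *t,
                             void (*fn)(struct demo_timer *))
{
    assert(t != NULL && fn != NULL);
    t->function = fn;
}

/* The callback gets the timer pointer and derives its container from it,
 * so no separate opaque data argument is needed. */
static void demo_conn_expired(struct demo_timer *t)
{
    struct demo_conn *conn = demo_from_timer(t, struct demo_conn, timer);

    conn->expirations++;
}
```

Because the timer is embedded in its owner, the callback needs no cast from an integer or untyped argument to find the object it belongs to.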
     

05 Jul, 2017

3 commits

  • The refcount_t type and its corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This helps avoid accidental
    refcounter overflows that might lead to use-after-free
    situations.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
     
  • The refcount_t type and its corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This helps avoid accidental
    refcounter overflows that might lead to use-after-free
    situations.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
     
  • The refcount_t type and its corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This helps avoid accidental
    refcounter overflows that might lead to use-after-free
    situations.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
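
A sketch of why a dedicated reference-count type helps, using C11 atomics. This is not the kernel's refcount_t implementation; the `demo_ref` names and the choice of UINT_MAX as the saturation point are assumptions. The idea shown is that a saturating counter leaks the object instead of wrapping to zero and freeing it while still in use.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

struct demo_ref {
    atomic_uint count;
};

/* Take a reference unless the object is already dead (0). A counter that
 * has reached UINT_MAX stays pinned there instead of wrapping. */
static int demo_ref_get(struct demo_ref *r)
{
    unsigned int old = atomic_load(&r->count);

    do {
        if (old == 0)
            return 0;               /* too late: object is being freed */
        if (old == UINT_MAX)
            return 1;               /* saturated: stay pinned */
    } while (!atomic_compare_exchange_weak(&r->count, &old, old + 1));
    return 1;
}

/* Drop a reference; returns 1 when the caller released the last one. */
static int demo_ref_put(struct demo_ref *r)
{
    unsigned int old = atomic_load(&r->count);

    do {
        if (old == UINT_MAX)
            return 0;               /* saturated counters never reach 0 */
        assert(old != 0);           /* underflow would be a caller bug */
    } while (!atomic_compare_exchange_weak(&r->count, &old, old - 1));
    return old == 1;
}
```

A plain atomic increment would wrap from UINT_MAX to 0, after which the next put could free an object that still has users.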
     

16 Jun, 2017

1 commit

  • It seems like a historic accident that these return unsigned char *,
    and in many places that means casts are required, more often than not.

    Make these functions return void * and remove all the casts across
    the tree, adding a (u8 *) cast only where the unsigned char pointer
    was used directly, all done with the following spatch:

    @@
    expression SKB, LEN;
    typedef u8;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    @@
    - *(fn(SKB, LEN))
    + *(u8 *)fn(SKB, LEN)

    @@
    expression E, SKB, LEN;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    type T;
    @@
    - E = ((T *)(fn(SKB, LEN)))
    + E = fn(SKB, LEN)

    @@
    expression SKB, LEN;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    @@
    - fn(SKB, LEN)[0]
    + *(u8 *)fn(SKB, LEN)

    Note that the last part there converts from push(...)[0] to the
    more idiomatic *(u8 *)push(...).

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
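
The effect of the return-type change above can be shown with a toy buffer. `demo_buf`, `demo_push()`, and `demo_hdr` are hypothetical stand-ins for the real structures; the sketch only demonstrates that a `void *` return assigns to any header pointer without a cast, and that a cast is needed only when a single byte is written directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy buffer with headroom in front of the payload. */
struct demo_buf {
    uint8_t storage[64];
    uint8_t *data;      /* start of the current payload */
};

static void demo_buf_init(struct demo_buf *b)
{
    b->data = b->storage + sizeof(b->storage);  /* empty, all headroom */
}

/* Prepends len bytes and returns the new start of the payload. */
static void *demo_push(struct demo_buf *b, size_t len)
{
    assert(len <= (size_t)(b->data - b->storage));
    b->data -= len;
    return b->data;
}

struct demo_hdr {
    uint8_t type;
    uint8_t flags;
};

static void demo_add_header(struct demo_buf *b)
{
    struct demo_hdr *h = demo_push(b, sizeof(*h));  /* no cast needed */

    h->type = 1;
    h->flags = 0;
    *(uint8_t *)demo_push(b, 1) = 0xf0;             /* single-byte push */
}
```

The last line of `demo_add_header()` has the same shape as the `*(u8 *)fn(SKB, LEN)` form produced by the spatch.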
     

10 Mar, 2017

1 commit

  • Lockdep issues a circular dependency warning when AFS issues an operation
    through AF_RXRPC from a context in which the VFS/VM holds the mmap_sem.

    The theory lockdep comes up with is as follows:

    (1) If the pagefault handler decides it needs to read pages from AFS, it
    calls AFS with mmap_sem held and AFS begins an AF_RXRPC call, but
    creating a call requires the socket lock:

    mmap_sem must be taken before sk_lock-AF_RXRPC

    (2) afs_open_socket() opens an AF_RXRPC socket and binds it. rxrpc_bind()
    binds the underlying UDP socket whilst holding its socket lock.
    inet_bind() takes its own socket lock:

    sk_lock-AF_RXRPC must be taken before sk_lock-AF_INET

    (3) Reading from a TCP socket into a userspace buffer might cause a fault
    and thus cause the kernel to take the mmap_sem, but the TCP socket is
    locked whilst doing this:

    sk_lock-AF_INET must be taken before mmap_sem

    However, lockdep's theory is wrong in this instance because it deals only
    with lock classes and not individual locks. The AF_INET lock in (2) isn't
    really equivalent to the AF_INET lock in (3) as the former deals with a
    socket entirely internal to the kernel that never sees userspace. This is
    a limitation in the design of lockdep.

    Fix the general case by:

    (1) Double up all the locking keys used in sockets so that one set are
    used if the socket is created by userspace and the other set is used
    if the socket is created by the kernel.

    (2) Store the kern parameter passed to sk_alloc() in a variable in the
    sock struct (sk_kern_sock). This informs sock_lock_init(),
    sock_init_data() and sk_clone_lock() as to the lock keys to be used.

    Note that the child created by sk_clone_lock() inherits the parent's
    kern setting.

    (3) Add a 'kern' parameter to ->accept() that is analogous to the one
    passed in to ->create() that distinguishes whether kernel_accept() or
    sys_accept4() was the caller and can be passed to sk_alloc().

    Note that a lot of accept functions merely dequeue an already
    allocated socket. I haven't touched these as the new socket already
    exists before we get the parameter.

    Note also that there are a couple of places where I've made the accepted
    socket unconditionally kernel-based:

    irda_accept()
    rds_tcp_accept_one()
    tcp_accept_from_sock()

    because they follow a sock_create_kern() and accept off of that.

    Whilst creating this, I noticed that lustre and ocfs don't create sockets
    through sock_create_kern() and thus they aren't marked as for-kernel,
    though they appear to be internal. I wonder if these should do that so
    that they use the new set of lock keys.

    Signed-off-by: David Howells
    Signed-off-by: David S. Miller

    David Howells
     

02 Mar, 2017

1 commit


17 Jan, 2017

1 commit

  • When an ax.25 socket connection times out, the sock struct may
    already have been taken down, i.e. the sock pointer is now NULL.
    Checking sock_flag on it causes a segfault. Check whether the sock
    pointer is NULL before checking sock_flag. This segfault is seen
    with timed-out netrom connections.

    Please submit to -stable.

    Signed-off-by: Basil Gunn
    Signed-off-by: David S. Miller

    Basil Gunn
     

25 Dec, 2016

1 commit


19 Jun, 2016

1 commit

  • A socket connection made in ax.25 is not closed when the session is
    completed. The socket gets closed by the heartbeat timer, but that
    timer is stopped prematurely. Allow the heartbeat timer to run so
    that it can close the socket. The symptom occurs in kernels >= 4.2.0.

    Originally sent 6/15/2016. Resend with distribution list matching
    scripts/maintainer.pl output.

    Signed-off-by: Basil Gunn
    Signed-off-by: David S. Miller

    Basil Gunn
     

10 Mar, 2016

1 commit

  • As a variable length protocol, AX25 fails link layer header validation
    tests based on a minimum length. header_ops.validate allows protocols
    to validate headers that are shorter than hard_header_len. Implement
    this callback for AX25.

    See also http://comments.gmane.org/gmane.linux.network/401064

    Signed-off-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Willem de Bruijn
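
A sketch of why a fixed minimum length does not suit a variable-length header, and what a validation callback can do instead. It is not the kernel's callback. It assumes the AX.25 address field layout of 7-octet addresses (destination, source, then up to 8 digipeaters), where the last address is marked by the low bit of its final octet; the function name and constants are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define ADDR_LEN   7    /* one address: 6 callsign octets + SSID octet */
#define MIN_ADDRS  2    /* destination and source */
#define MAX_ADDRS 10    /* plus up to 8 digipeaters */

/* Returns the address-field length in bytes if hdr holds a complete,
 * properly terminated address field, or 0 otherwise. The field ends at
 * the first address whose last octet has its low bit set. */
static size_t demo_addr_field_len(const unsigned char *hdr, size_t len)
{
    size_t n;

    assert(hdr != NULL || len == 0);

    for (n = 1; n <= MAX_ADDRS && n * ADDR_LEN <= len; n++) {
        if (hdr[n * ADDR_LEN - 1] & 0x01)
            return n >= MIN_ADDRS ? n * ADDR_LEN : 0;
    }
    return 0;   /* truncated, or no terminating address found */
}
```

A 14-byte header with two addresses is valid here even though the longest possible field is 70 bytes, which is the kind of case a single minimum-length test cannot express.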