11 Feb, 2020

1 commit

  • [ Upstream commit 0d0d9a388a858e271bb70e71e99e7fe2a6fd6f64 ]

    In the past it was possible to create multiple L2TPv3 sessions with the
    same session id as long as the sessions belonged to different tunnels.
    The resulting sessions had issues when used with IP encapsulated tunnels,
    but worked fine with UDP encapsulated ones. Some applications began to
    rely on this behaviour to avoid having to negotiate unique session ids.

    Some time ago a change was made to require session ids to be unique across
    all tunnels, breaking the applications making use of this "feature".

    This change relaxes the duplicate session id check to allow duplicates
    if both of the colliding sessions belong to UDP encapsulated tunnels.

    Fixes: dbdbc73b4478 ("l2tp: fix duplicate session creation")
    Signed-off-by: Ridge Kennedy
    Acked-by: James Chapman
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Ridge Kennedy
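
The relaxed duplicate check described in this commit can be sketched in userspace C as follows. This is a minimal illustration, not the kernel code: the `tunnel`, `session` and `encap_type` types and the `session_may_coexist()` helper are hypothetical names, and rejecting duplicates inside a single tunnel is an assumption drawn from the "different tunnels" wording above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
enum encap_type { ENCAP_UDP, ENCAP_IP };

struct tunnel {
    uint32_t tunnel_id;
    enum encap_type encap;
};

struct session {
    uint32_t session_id;
    const struct tunnel *tunnel;
};

/*
 * Return true if 'candidate' may be registered alongside 'existing'.
 * Sessions with different ids never collide. Sessions sharing an id are
 * tolerated only when they sit in different tunnels and both tunnels use
 * UDP encapsulation.
 */
bool session_may_coexist(const struct session *existing,
                         const struct session *candidate)
{
    assert(existing != NULL && candidate != NULL);

    if (existing->session_id != candidate->session_id)
        return true;
    if (existing->tunnel == candidate->tunnel)
        return false; /* assumed: a duplicate within one tunnel is an error */

    return existing->tunnel->encap == ENCAP_UDP &&
           candidate->tunnel->encap == ENCAP_UDP;
}
```

With this rule, two UDP tunnels can each carry a session 42, while a colliding session in an IP encapsulated tunnel is still refused.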
     

18 Dec, 2019

1 commit

  • [ Upstream commit c4e85f73afb6384123e5ef1bba3315b2e3ad031e ]

    This will be used in the conversion of ipv6_stub to ip6_dst_lookup_flow,
    as some modules currently pass a net argument without a socket to
    ip6_dst_lookup. This is equivalent to commit 343d60aada5a ("ipv6: change
    ipv6_stub_impl.ipv6_dst_lookup to take net argument").

    Signed-off-by: Sabrina Dubroca
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Sabrina Dubroca
     

25 Oct, 2019

1 commit

  • Some interface types could be nested.
    (VLAN, BONDING, TEAM, MACSEC, MACVLAN, IPVLAN, VIRT_WIFI, VXLAN, etc..)
    These interface types should set a lockdep class because, without a lockdep
    class key, lockdep always warns about nonexistent circular locking.

    In the current code, these interfaces have their own lockdep class keys and
    manage them themselves, so there is a lot of duplicate code around
    /drivers/net and /net/.
    This patch adds new generic lockdep keys and some helper functions for it.

    This patch makes the following changes.
    a) Add lockdep class keys in struct net_device
    - qdisc_running, xmit, addr_list, qdisc_busylock
    - these keys are used as dynamic lockdep key.
    b) When net_device is being allocated, lockdep keys are registered.
    - alloc_netdev_mqs()
    c) When net_device is being freed, lockdep keys are unregistered.
    - free_netdev()
    d) Add generic lockdep key helper function
    - netdev_register_lockdep_key()
    - netdev_unregister_lockdep_key()
    - netdev_update_lockdep_key()
    e) Remove unnecessary generic lockdep macro and functions
    f) Remove unnecessary lockdep code from each interface.

    After this patch, interface modules no longer need to maintain
    their own lockdep keys.

    Signed-off-by: Taehee Yoo
    Signed-off-by: David S. Miller

    Taehee Yoo
     

02 Oct, 2019

1 commit

  • commit 174e23810cd31
    ("sk_buff: drop all skb extensions on free and skb scrubbing") made napi
    recycle always drop skb extensions. The additional skb_ext_del() that is
    performed via nf_reset on napi skb recycle is not needed anymore.

    Most nf_reset() calls in the stack are there so queued skb won't block
    'rmmod nf_conntrack' indefinitely.

    This removes the skb_ext_del from nf_reset, and renames it to a more
    fitting nf_reset_ct().

    In a few selected places, add a call to skb_ext_reset to make sure that
    no active extensions remain.

    I am submitting this for "net", because we're still early in the release
    cycle. The patch applies to net-next too, but I think the rename causes
    needless divergence between those trees.

    Suggested-by: Eric Dumazet
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

31 Jul, 2019

1 commit

  • Support for handling the PPPOEIOCSFWD ioctl in compat mode was added in
    linux-2.5.69 along with hundreds of other commands, but was always broken
    since only the structure is compatible, but the command number is not,
    due to the size being sizeof(size_t), or at first sizeof(sizeof(struct
    sockaddr_pppox)), which is different on 64-bit architectures.

    Guillaume Nault adds:

    And the implementation was broken until 2016 (see 29e73269aa4d ("pppoe:
    fix reference counting in PPPoE proxy")), and nobody ever noticed. I
    should probably have removed this ioctl entirely instead of fixing it.
    Clearly, it has never been used.

    Fix it by adding a compat_ioctl handler for all pppoe variants that
    translates the command number and then calls the regular ioctl function.

    All other ioctl commands handled by pppoe are compatible between 32-bit
    and 64-bit, and require compat_ptr() conversion.

    This should apply to all stable kernels.

    Acked-by: Guillaume Nault
    Signed-off-by: Arnd Bergmann
    Signed-off-by: David S. Miller

    Arnd Bergmann
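
The command-number mismatch described in this commit can be illustrated with a small userspace sketch. It re-implements the generic `_IOW()` bit layout locally (direction, size, type, number) instead of including kernel headers. The type byte `0xB1` and number `0` are recalled from the PPPoE UAPI header and should be treated as assumptions, and `sketch_iow()`, `pppoe_fwd_cmd()` and `compat_translate()` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative copy of the generic Linux _IOW() layout: direction in
 * bits 30-31, argument size in bits 16-29, type in bits 8-15, number in
 * bits 0-7. All names are local to this sketch.
 */
#define SKETCH_IOC_WRITE 1u

static uint32_t sketch_iow(uint32_t type, uint32_t nr, uint32_t size)
{
    assert(size < (1u << 14)); /* the size field is 14 bits wide */
    return (SKETCH_IOC_WRITE << 30) | (size << 16) | (type << 8) | nr;
}

/* Command number as seen by a caller whose size_t is 'size_t_bytes' wide. */
uint32_t pppoe_fwd_cmd(uint32_t size_t_bytes)
{
    return sketch_iow(0xB1, 0, size_t_bytes);
}

/* Translate the 32-bit command number into the native 64-bit one. */
uint32_t compat_translate(uint32_t cmd)
{
    if (cmd == pppoe_fwd_cmd(4))
        return pppoe_fwd_cmd(8);
    return cmd; /* other commands are assumed to need no renumbering */
}
```

Because the argument size is encoded in the command number, a 32-bit caller and a 64-bit kernel disagree on the value even though the payload layout is the same, which is why a translating compat handler is needed.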
     

09 Jul, 2019

1 commit

  • Processes can request ipv6 flowlabels with cmsg IPV6_FLOWINFO.
    If not set, by default an autogenerated flowlabel is selected.

    Explicit flowlabels require a control operation per label plus a
    datapath check on every connection (every datagram if unconnected).
    This is particularly expensive on unconnected sockets multiplexing
    many flows, such as QUIC.

    In the common case, where no lease is exclusive, the check can be
    safely elided, as both lease request and check trivially succeed.
    Indeed, autoflowlabel does the same even with exclusive leases.

    Elide the check if no process has requested an exclusive lease.

    fl6_sock_lookup previously returned either a reference to a lease or
    NULL to denote failure. Modify to return a real error and update
    all callers. On return NULL, they can use the label and will elide
    the atomic_dec in fl6_sock_release.

    This is an optimization. Robust applications still have to revert to
    requesting leases if the fast path fails due to an exclusive lease.

    Changes RFC->v1:
    - use static_key_false_deferred to rate limit jump label operations
    - call static_key_deferred_flush to stop timers on exit
    - move decrement out of RCU context
    - defer optimization also if opt data is associated with a lease
    - updated all fl6_sock_lookup callers, not just udp

    Signed-off-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Willem de Bruijn
     

22 Jun, 2019

1 commit


19 Jun, 2019

1 commit

  • Based on 2 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation #

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 4122 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Enrico Weigelt
    Reviewed-by: Kate Stewart
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

14 Jun, 2019

1 commit

  • When calling debugfs functions, there is no need to ever check the
    return value. The function can work or not, but the code logic should
    never do something different based on this.

    Also, there is no need to store the individual debugfs file name, just
    remove the whole directory all at once, saving a local variable.

    Cc: "David S. Miller"
    Cc: Guillaume Nault
    Cc: netdev@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman
    Acked-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Greg Kroah-Hartman
     

31 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 3029 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

21 May, 2019

1 commit


08 May, 2019

2 commits

  • Minor conflict with the DSA legacy code removal.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • BUG: unable to handle kernel NULL pointer dereference at 0000000000000128
    PGD 0 P4D 0
    Oops: 0000 [#1]
    CPU: 0 PID: 5697 Comm: modprobe Tainted: G W 5.1.0-rc7+ #1
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
    RIP: 0010:__lock_acquire+0x53/0x10b0
    Code: 8b 1c 25 40 5e 01 00 4c 8b 6d 10 45 85 e4 0f 84 bd 06 00 00 44 8b 1d 7c d2 09 02 49 89 fe 41 89 d2 45 85 db 0f 84 47 02 00 00 81 3f a0 05 70 83 b8 00 00 00 00 44 0f 44 c0 83 fe 01 0f 86 3a
    RSP: 0018:ffffc90001c07a28 EFLAGS: 00010002
    RAX: 0000000000000000 RBX: ffff88822f038440 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000128
    RBP: ffffc90001c07a88 R08: 0000000000000001 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
    R13: 0000000000000000 R14: 0000000000000128 R15: 0000000000000000
    FS: 00007fead0811540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000128 CR3: 00000002310da000 CR4: 00000000000006f0
    Call Trace:
    ? __lock_acquire+0x24e/0x10b0
    lock_acquire+0xdf/0x230
    ? flush_workqueue+0x71/0x530
    flush_workqueue+0x97/0x530
    ? flush_workqueue+0x71/0x530
    l2tp_exit_net+0x170/0x2b0 [l2tp_core]
    ? l2tp_exit_net+0x93/0x2b0 [l2tp_core]
    ops_exit_list.isra.6+0x36/0x60
    unregister_pernet_operations+0xb8/0x110
    unregister_pernet_device+0x25/0x40
    l2tp_init+0x55/0x1000 [l2tp_core]
    ? 0xffffffffa018d000
    do_one_initcall+0x6c/0x3cc
    ? do_init_module+0x22/0x1f1
    ? rcu_read_lock_sched_held+0x97/0xb0
    ? kmem_cache_alloc_trace+0x325/0x3b0
    do_init_module+0x5b/0x1f1
    load_module+0x1db1/0x2690
    ? m_show+0x1d0/0x1d0
    __do_sys_finit_module+0xc5/0xd0
    __x64_sys_finit_module+0x15/0x20
    do_syscall_64+0x6b/0x1d0
    entry_SYSCALL_64_after_hwframe+0x49/0xbe
    RIP: 0033:0x7fead031a839
    Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48
    RSP: 002b:00007ffe8d9acca8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
    RAX: ffffffffffffffda RBX: 0000560078398b80 RCX: 00007fead031a839
    RDX: 0000000000000000 RSI: 000056007659dc2e RDI: 0000000000000003
    RBP: 000056007659dc2e R08: 0000000000000000 R09: 0000560078398b80
    R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000000
    R13: 00005600783a04a0 R14: 0000000000040000 R15: 0000560078398b80
    Modules linked in: l2tp_core(+) e1000 ip_tables ipv6 [last unloaded: l2tp_core]
    CR2: 0000000000000128
    ---[ end trace 8322b2b8bf83f8e1 ]---

    If alloc_workqueue fails in l2tp_init, l2tp_net_ops
    is unregistered on the failure path. l2tp_exit_net is then
    called and flushes a NULL workqueue. This patch
    adds a NULL check to fix it.

    Fixes: 67e04c29ec0d ("l2tp: unregister l2tp_net_ops on failure path")
    Signed-off-by: YueHaibing
    Acked-by: Guillaume Nault
    Signed-off-by: David S. Miller

    YueHaibing
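
The failure path and the added guard can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: `struct workqueue`, `flush_sketch()`, `exit_net_sketch()` and `set_wq_sketch()` are hypothetical stand-ins, and the `assert()` merely represents the NULL dereference seen in the oops above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel workqueue. */
struct workqueue { int flushed; };

static struct workqueue *l2tp_wq_sketch; /* NULL until allocation succeeds */

static void flush_sketch(struct workqueue *wq)
{
    assert(wq != NULL); /* models the crash: flushing NULL is invalid */
    wq->flushed++;
}

/* Per-namespace exit hook: may run even if module init failed early. */
void exit_net_sketch(void)
{
    if (l2tp_wq_sketch) /* the kind of guard the patch adds */
        flush_sketch(l2tp_wq_sketch);
}

/* Install (or clear) the workqueue, as init would after allocation. */
void set_wq_sketch(struct workqueue *wq)
{
    l2tp_wq_sketch = wq;
}
```

Calling `exit_net_sketch()` before any workqueue has been installed is then a harmless no-op.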
     

03 May, 2019

1 commit


30 Apr, 2019

1 commit

  • Before taking a refcount on a rcu protected structure,
    we need to make sure the refcount is not zero.

    syzbot reported :

    refcount_t: increment on 0; use-after-free.
    WARNING: CPU: 1 PID: 23533 at lib/refcount.c:156 refcount_inc_checked lib/refcount.c:156 [inline]
    WARNING: CPU: 1 PID: 23533 at lib/refcount.c:156 refcount_inc_checked+0x61/0x70 lib/refcount.c:154
    Kernel panic - not syncing: panic_on_warn set ...
    CPU: 1 PID: 23533 Comm: syz-executor.2 Not tainted 5.1.0-rc7+ #93
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x172/0x1f0 lib/dump_stack.c:113
    panic+0x2cb/0x65c kernel/panic.c:214
    __warn.cold+0x20/0x45 kernel/panic.c:571
    report_bug+0x263/0x2b0 lib/bug.c:186
    fixup_bug arch/x86/kernel/traps.c:179 [inline]
    fixup_bug arch/x86/kernel/traps.c:174 [inline]
    do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272
    do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291
    invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973
    RIP: 0010:refcount_inc_checked lib/refcount.c:156 [inline]
    RIP: 0010:refcount_inc_checked+0x61/0x70 lib/refcount.c:154
    Code: 1d 98 2b 2a 06 31 ff 89 de e8 db 2c 40 fe 84 db 75 dd e8 92 2b 40 fe 48 c7 c7 20 7a a1 87 c6 05 78 2b 2a 06 01 e8 7d d9 12 fe 0b eb c1 90 90 90 90 90 90 90 90 90 90 90 55 48 89 e5 41 57 41
    RSP: 0018:ffff888069f0fba8 EFLAGS: 00010286
    RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
    RDX: 000000000000f353 RSI: ffffffff815afcb6 RDI: ffffed100d3e1f67
    RBP: ffff888069f0fbb8 R08: ffff88809b1845c0 R09: ffffed1015d23ef1
    R10: ffffed1015d23ef0 R11: ffff8880ae91f787 R12: ffff8880a8f26968
    R13: 0000000000000004 R14: dffffc0000000000 R15: ffff8880a49a6440
    l2tp_tunnel_inc_refcount net/l2tp/l2tp_core.h:240 [inline]
    l2tp_tunnel_get+0x250/0x580 net/l2tp/l2tp_core.c:173
    pppol2tp_connect+0xc00/0x1c70 net/l2tp/l2tp_ppp.c:702
    __sys_connect+0x266/0x330 net/socket.c:1808
    __do_sys_connect net/socket.c:1819 [inline]
    __se_sys_connect net/socket.c:1816 [inline]
    __x64_sys_connect+0x73/0xb0 net/socket.c:1816

    Fixes: 54652eb12c1b ("l2tp: hold tunnel while looking up sessions in l2tp_netlink")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Cc: Guillaume Nault
    Signed-off-by: David S. Miller

    Eric Dumazet
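
The rule stated at the top of this commit (never take a reference on an RCU protected object whose count has already reached zero) can be sketched with C11 atomics. This is an illustrative userspace analogue, not the kernel's refcount API; `struct tunnel_sketch` and `get_unless_zero()` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct tunnel_sketch {
    atomic_int refcnt;
};

/*
 * Take a reference only if the object is still live. Returns false when
 * the count has already dropped to zero, i.e. the object is being freed
 * and must not be used by the caller.
 */
bool get_unless_zero(struct tunnel_sketch *t)
{
    int old;

    assert(t != NULL);
    old = atomic_load(&t->refcnt);
    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&t->refcnt, &old, old + 1))
            return true;
    }
    return false;
}
```

A lookup that finds the object in an RCU protected list would call such a helper and treat a `false` result as "not found".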
     

28 Apr, 2019

2 commits

  • Add options to strictly validate messages and dump messages.
    Sometimes validating dump messages non-strictly may
    be required, so add an option for that as well.

    Since none of this can really be applied to existing commands,
    set the options everywhere using the following spatch:

    @@
    identifier ops;
    expression X;
    @@
    struct genl_ops ops[] = {
    ...,
    {
    .cmd = X,
    + .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
    ...
    },
    ...
    };

    For new commands one should just not copy the .validate 'opt-out'
    flags and thus get strict validation.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • Even if the NLA_F_NESTED flag was introduced more than 11 years ago, most
    netlink based interfaces (including recently added ones) are still not
    setting it in kernel generated messages. Without the flag, message parsers
    not aware of attribute semantics (e.g. wireshark dissector or libmnl's
    mnl_nlmsg_fprintf()) cannot recognize nested attributes and won't display
    the structure of their contents.

    Unfortunately we cannot just add the flag everywhere as there may be
    userspace applications which check nlattr::nla_type directly rather than
    through a helper masking out the flags. Therefore the patch renames
    nla_nest_start() to nla_nest_start_noflag() and introduces nla_nest_start()
    as a wrapper adding NLA_F_NESTED. The calls which add NLA_F_NESTED manually
    are rewritten to use nla_nest_start().

    Except for changes in include/net/netlink.h, the patch was generated using
    this semantic patch:

    @@ expression E1, E2; @@
    -nla_nest_start(E1, E2)
    +nla_nest_start_noflag(E1, E2)

    @@ expression E1, E2; @@
    -nla_nest_start_noflag(E1, E2 | NLA_F_NESTED)
    +nla_nest_start(E1, E2)

    Signed-off-by: Michal Kubecek
    Acked-by: Jiri Pirko
    Acked-by: David Ahern
    Signed-off-by: David S. Miller

    Michal Kubecek
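
Why the flag cannot simply be added everywhere is easier to see with the bit values in front of you. The sketch below copies the flag and mask values from the netlink UAPI header under local `SK_` names. The three helpers are hypothetical and only compute the 16-bit type field, they do not build real netlink messages.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirror the netlink UAPI definitions, under sketch-local names. */
#define SK_NLA_F_NESTED        (1u << 15)
#define SK_NLA_F_NET_BYTEORDER (1u << 14)
#define SK_NLA_TYPE_MASK       (~(SK_NLA_F_NESTED | SK_NLA_F_NET_BYTEORDER))

/* Type field as written by the legacy helper: no flag set. */
uint16_t nest_type_noflag(uint16_t attrtype)
{
    /* Callers pass a plain attribute type without flag bits. */
    assert((attrtype & (SK_NLA_F_NESTED | SK_NLA_F_NET_BYTEORDER)) == 0);
    return attrtype;
}

/* Type field as written by the new wrapper: NLA_F_NESTED added. */
uint16_t nest_type(uint16_t attrtype)
{
    return (uint16_t)(nest_type_noflag(attrtype) | SK_NLA_F_NESTED);
}

/* What a careful parser does: mask the flags before comparing types. */
uint16_t attr_type(uint16_t nla_type)
{
    return (uint16_t)(nla_type & SK_NLA_TYPE_MASK);
}
```

A userspace program that compares the raw type field against `5` would stop matching once the kernel emits `0x8005`, whereas one that masks first keeps working.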
     

26 Apr, 2019

1 commit

  • Canonical way to fetch sk_user_data from an encap_rcv() handler called
    from UDP stack in rcu protected section is to use rcu_dereference_sk_user_data(),
    otherwise compiler might read it multiple times.

    Fixes: d00fa9adc528 ("l2tp: fix races with tunnel socket close")
    Signed-off-by: Eric Dumazet
    Cc: James Chapman
    Signed-off-by: David S. Miller

    Eric Dumazet
     

20 Apr, 2019

1 commit

  • The SIOCGSTAMP/SIOCGSTAMPNS ioctl commands are implemented by many
    socket protocol handlers, and all of those end up calling the same
    sock_get_timestamp()/sock_get_timestampns() helper functions, which
    results in a lot of duplicate code.

    With the introduction of 64-bit time_t on 32-bit architectures, this
    gets worse, as we then need four different ioctl commands in each
    socket protocol implementation.

    To simplify that, let's add a new .gettstamp() operation in
    struct proto_ops, and move ioctl implementation into the common
    sock_ioctl()/compat_sock_ioctl_trans() functions that these all go
    through.

    We can reuse the sock_get_timestamp() implementation, but generalize
    it so it can deal with both native and compat mode, as well as
    timeval and timespec structures.

    Acked-by: Stefan Schmidt
    Acked-by: Neil Horman
    Acked-by: Marc Kleine-Budde
    Link: https://lore.kernel.org/lkml/CAK8P3a038aDQQotzua_QtKGhq8O9n+rdiz2=WDCp82ys8eUT+A@mail.gmail.com/
    Signed-off-by: Arnd Bergmann
    Acked-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Arnd Bergmann
     

19 Apr, 2019

1 commit

  • GCC complains:

    net/l2tp/l2tp_ppp.c: In function ‘pppol2tp_ioctl’:
    net/l2tp/l2tp_ppp.c:1073:6: warning: variable ‘val’ set but not used [-Wunused-but-set-variable]
    int val;
    ^~~

    Signed-off-by: Jakub Kicinski
    Reviewed-by: Dirk van der Merwe
    Signed-off-by: David S. Miller

    Jakub Kicinski
     

22 Mar, 2019

1 commit

  • Since maxattr is common, the policy can't really differ sanely,
    so make it common as well.

    The only user that did in fact manage to make a non-common policy
    is taskstats, which has to be really careful about it (since it's
    still using a common maxattr!). This is no longer supported, but
    we can fake it using pre_doit.

    This reduces the size of e.g. nl80211.o (which has lots of commands):

    text data bss dec hex filename
    398745 14323 2240 415308 6564c net/wireless/nl80211.o (before)
    397913 14331 2240 414484 65314 net/wireless/nl80211.o (after)
    --------------------------------
    -832 +8 0 -824

    Which is obviously just 8 bytes for each command, and an added 8
    bytes for the new policy pointer. I'm not sure why the ops list is
    counted as .text though.

    Most of the code transformations were done using the following spatch:
    @ops@
    identifier OPS;
    expression POLICY;
    @@
    struct genl_ops OPS[] = {
    ...,
    {
    - .policy = POLICY,
    },
    ...
    };

    @@
    identifier ops.OPS;
    expression ops.POLICY;
    identifier fam;
    expression M;
    @@
    struct genl_family fam = {
    .ops = OPS,
    .maxattr = M,
    + .policy = POLICY,
    ...
    };

    This also gets rid of devlink_nl_cmd_region_read_dumpit() accessing
    the cb->data as ops, which we want to change in a later genl patch.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     

14 Mar, 2019

1 commit

  • Back in 2013 Hannes took care of most of such leaks in commit
    bceaa90240b6 ("inet: prevent leakage of uninitialized memory to user in recv syscalls")

    But the bug in l2tp_ip6_recvmsg() has not been fixed.

    syzbot report :

    BUG: KMSAN: kernel-infoleak in _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32
    CPU: 1 PID: 10996 Comm: syz-executor362 Not tainted 5.0.0+ #11
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x173/0x1d0 lib/dump_stack.c:113
    kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:600
    kmsan_internal_check_memory+0x9f4/0xb10 mm/kmsan/kmsan.c:694
    kmsan_copy_to_user+0xab/0xc0 mm/kmsan/kmsan_hooks.c:601
    _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32
    copy_to_user include/linux/uaccess.h:174 [inline]
    move_addr_to_user+0x311/0x570 net/socket.c:227
    ___sys_recvmsg+0xb65/0x1310 net/socket.c:2283
    do_recvmmsg+0x646/0x10c0 net/socket.c:2390
    __sys_recvmmsg net/socket.c:2469 [inline]
    __do_sys_recvmmsg net/socket.c:2492 [inline]
    __se_sys_recvmmsg+0x1d1/0x350 net/socket.c:2485
    __x64_sys_recvmmsg+0x62/0x80 net/socket.c:2485
    do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
    entry_SYSCALL_64_after_hwframe+0x63/0xe7
    RIP: 0033:0x445819
    Code: e8 6c b6 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 0f 83 2b 12 fc ff c3 66 2e 0f 1f 84 00 00 00 00
    RSP: 002b:00007f64453eddb8 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
    RAX: ffffffffffffffda RBX: 00000000006dac28 RCX: 0000000000445819
    RDX: 0000000000000005 RSI: 0000000020002f80 RDI: 0000000000000003
    RBP: 00000000006dac20 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dac2c
    R13: 00007ffeba8f87af R14: 00007f64453ee9c0 R15: 20c49ba5e353f7cf

    Local variable description: ----addr@___sys_recvmsg
    Variable was created at:
    ___sys_recvmsg+0xf6/0x1310 net/socket.c:2244
    do_recvmmsg+0x646/0x10c0 net/socket.c:2390

    Bytes 0-31 of 32 are uninitialized
    Memory access of size 32 starts at ffff8880ae62fbb0
    Data copied to user address 0000000020000000

    Fixes: a32e0eec7042 ("l2tp: introduce L2TPv3 IP encapsulation support for IPv6")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Signed-off-by: David S. Miller

    Eric Dumazet
     

01 Feb, 2019

1 commit

  • The size of L2TPv2 header with all optional fields is 14 bytes.
    l2tp_udp_recv_core only moves 10 bytes to the linear part of a
    skb. This may lead to l2tp_recv_common reading data outside of a skb.

    This patch makes sure that there are at least 14 bytes in the linear
    part of a skb to meet the maximum need of l2tp_udp_recv_core and
    l2tp_recv_common. The minimum size of both PPP HDLC-like frame and
    Ethernet frame is larger than 14 bytes, so we are safe to do so.

    Also remove L2TP_HDR_SIZE_NOSEQ, it is unused now.

    Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
    Suggested-by: Guillaume Nault
    Signed-off-by: Jacob Wen
    Acked-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Jacob Wen
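
The 14-byte figure in this commit follows from the L2TPv2 header layout. A short sketch, assuming the flag bit positions from RFC 2661 section 3.1 and a hypothetical helper name `l2tpv2_hdr_len()`, adds up the optional fields:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* L2TPv2 flag bits in the first 16-bit word (RFC 2661, section 3.1). */
#define HDRFLAG_L 0x4000u /* length field present */
#define HDRFLAG_S 0x0800u /* Ns and Nr fields present */
#define HDRFLAG_O 0x0200u /* offset size field present */

/* Header bytes implied by the flags, excluding any offset padding. */
size_t l2tpv2_hdr_len(uint16_t flags)
{
    size_t len = 2 + 2 + 2; /* flags/version, tunnel id, session id */

    if (flags & HDRFLAG_L)
        len += 2; /* length */
    if (flags & HDRFLAG_S)
        len += 4; /* Ns + Nr */
    if (flags & HDRFLAG_O)
        len += 2; /* offset size */

    assert(len <= 14); /* 14 is the maximum with every option set */
    return len;
}
```

With only the sequence fields present the header is 10 bytes, which matches the amount previously pulled into the linear area. With all optional fields present it grows to 14 bytes.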
     

31 Jan, 2019

1 commit

  • Use pskb_may_pull() to make sure the optional fields are in skb linear
    parts, so we can safely read them later.

    It's easy to reproduce the issue with a net driver that supports paged
    skb data. Just create an L2TPv3 over IP tunnel and then generate some
    network traffic.
    Once reproduced, rx err in /sys/kernel/debug/l2tp/tunnels will increase.

    Changes in v4:
    1. s/l2tp_v3_pull_opt/l2tp_v3_ensure_opt_in_linear/
    2. s/tunnel->version != L2TP_HDR_VER_2/tunnel->version == L2TP_HDR_VER_3/
    3. Add 'Fixes' in commit messages.

    Changes in v3:
    1. To keep consistency, move the code out of l2tp_recv_common.
    2. Use "net" instead of "net-next", since this is a bug fix.

    Changes in v2:
    1. Only fix L2TPv3 to make code simple.
    To fix both L2TPv3 and L2TPv2, we'd better refactor l2tp_recv_common.
    It's complicated to do so.
    2. Reloading pointers after pskb_may_pull

    Fixes: f7faffa3ff8e ("l2tp: Add L2TPv3 protocol support")
    Fixes: 0d76751fad77 ("l2tp: Add L2TPv3 IP encapsulation (no UDP) support")
    Fixes: a32e0eec7042 ("l2tp: introduce L2TPv3 IP encapsulation support for IPv6")
    Signed-off-by: Jacob Wen
    Acked-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Jacob Wen
     

15 Nov, 2018

1 commit

  • This issue happens when trying to add an existing tunnel. It
    doesn't call sock_put() before returning -EEXIST to release
    the sock refcnt that was held by calling sock_hold() before
    the existence check.

    This patch is to fix it by holding the sock after doing the
    existence check.

    Fixes: f6cd651b056f ("l2tp: fix race in duplicate tunnel detection")
    Reported-by: Jianlin Shi
    Signed-off-by: Xin Long
    Reviewed-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Xin Long
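
The ordering fix in this commit (take the socket reference only after the existence check has passed) can be sketched as below. Everything here is a hypothetical userspace model: `struct sock_sketch`, the fixed-size id table and `tunnel_register_sketch()` are not kernel names.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct sock_sketch { int refcnt; };

#define MAX_TUNNELS 8
static unsigned int registered_ids[MAX_TUNNELS];
static size_t nr_registered;

static bool tunnel_exists(unsigned int id)
{
    for (size_t i = 0; i < nr_registered; i++)
        if (registered_ids[i] == id)
            return true;
    return false;
}

/* Register a tunnel id, holding the socket only on success. */
int tunnel_register_sketch(unsigned int id, struct sock_sketch *sk)
{
    assert(sk != NULL);

    if (tunnel_exists(id))
        return -EEXIST; /* nothing held yet, so nothing to release */
    if (nr_registered == MAX_TUNNELS)
        return -ENOSPC;

    sk->refcnt++; /* hold the socket once registration is certain */
    registered_ids[nr_registered++] = id;
    return 0;
}
```

Because the reference is taken after the check, the `-EEXIST` return leaves the socket's count untouched and no matching release is needed on that path.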
     

14 Aug, 2018

1 commit

  • Removing one of the callers of pppol2tp_session_get_sock caused a harmless
    warning in some configurations:

    net/l2tp/l2tp_ppp.c:142:21: 'pppol2tp_session_get_sock' defined but not used [-Wunused-function]

    Rather than adding another #ifdef here, using a proper IS_ENABLED()
    check makes the code more readable and avoids those warnings while
    letting the compiler figure out for itself which code is needed.

    This adds one pointer for the unused show() callback in struct
    l2tp_session, but that seems harmless.

    Fixes: b0e29063dcb3 ("l2tp: remove pppol2tp_session_ioctl()")
    Signed-off-by: Arnd Bergmann
    Signed-off-by: David S. Miller

    Arnd Bergmann
     

13 Aug, 2018

1 commit

  • In l2tp code, if it is a L2TP_UDP_ENCAP tunnel, tunnel->sk points to a
    UDP socket. A user could call sendmsg() on both this tunnel and the UDP
    socket itself concurrently. As l2tp_xmit_skb() holds the socket lock and
    calls __sk_dst_check() to refresh sk->sk_dst_cache, while udpv6_sendmsg()
    is lockless and calls sk_dst_check() to refresh sk->sk_dst_cache, there
    could be a race that causes the dst cache to be freed multiple times.
    So we fix the l2tp side code to always call sk_dst_check() to guarantee
    xchg() is called when refreshing sk->sk_dst_cache to avoid race
    conditions.

    Syzkaller reported stack trace:
    BUG: KASAN: use-after-free in atomic_read include/asm-generic/atomic-instrumented.h:21 [inline]
    BUG: KASAN: use-after-free in atomic_fetch_add_unless include/linux/atomic.h:575 [inline]
    BUG: KASAN: use-after-free in atomic_add_unless include/linux/atomic.h:597 [inline]
    BUG: KASAN: use-after-free in dst_hold_safe include/net/dst.h:308 [inline]
    BUG: KASAN: use-after-free in ip6_hold_safe+0xe6/0x670 net/ipv6/route.c:1029
    Read of size 4 at addr ffff8801aea9a880 by task syz-executor129/4829

    CPU: 0 PID: 4829 Comm: syz-executor129 Not tainted 4.18.0-rc7-next-20180802+ #30
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
    print_address_description+0x6c/0x20b mm/kasan/report.c:256
    kasan_report_error mm/kasan/report.c:354 [inline]
    kasan_report.cold.7+0x242/0x30d mm/kasan/report.c:412
    check_memory_region_inline mm/kasan/kasan.c:260 [inline]
    check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267
    kasan_check_read+0x11/0x20 mm/kasan/kasan.c:272
    atomic_read include/asm-generic/atomic-instrumented.h:21 [inline]
    atomic_fetch_add_unless include/linux/atomic.h:575 [inline]
    atomic_add_unless include/linux/atomic.h:597 [inline]
    dst_hold_safe include/net/dst.h:308 [inline]
    ip6_hold_safe+0xe6/0x670 net/ipv6/route.c:1029
    rt6_get_pcpu_route net/ipv6/route.c:1249 [inline]
    ip6_pol_route+0x354/0xd20 net/ipv6/route.c:1922
    ip6_pol_route_output+0x54/0x70 net/ipv6/route.c:2098
    fib6_rule_lookup+0x283/0x890 net/ipv6/fib6_rules.c:122
    ip6_route_output_flags+0x2c5/0x350 net/ipv6/route.c:2126
    ip6_dst_lookup_tail+0x1278/0x1da0 net/ipv6/ip6_output.c:978
    ip6_dst_lookup_flow+0xc8/0x270 net/ipv6/ip6_output.c:1079
    ip6_sk_dst_lookup_flow+0x5ed/0xc50 net/ipv6/ip6_output.c:1117
    udpv6_sendmsg+0x2163/0x36b0 net/ipv6/udp.c:1354
    inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798
    sock_sendmsg_nosec net/socket.c:622 [inline]
    sock_sendmsg+0xd5/0x120 net/socket.c:632
    ___sys_sendmsg+0x51d/0x930 net/socket.c:2115
    __sys_sendmmsg+0x240/0x6f0 net/socket.c:2210
    __do_sys_sendmmsg net/socket.c:2239 [inline]
    __se_sys_sendmmsg net/socket.c:2236 [inline]
    __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2236
    do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
    entry_SYSCALL_64_after_hwframe+0x49/0xbe
    RIP: 0033:0x446a29
    Code: e8 ac b8 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
    RSP: 002b:00007f4de5532db8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
    RAX: ffffffffffffffda RBX: 00000000006dcc38 RCX: 0000000000446a29
    RDX: 00000000000000b8 RSI: 0000000020001b00 RDI: 0000000000000003
    RBP: 00000000006dcc30 R08: 00007f4de5533700 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dcc3c
    R13: 00007ffe2b830fdf R14: 00007f4de55339c0 R15: 0000000000000001

    Fixes: 71b1391a4128 ("l2tp: ensure sk->dst is still valid")
    Reported-by: syzbot+05f840f3b04f211bad55@syzkaller.appspotmail.com
    Signed-off-by: Wei Wang
    Signed-off-by: Martin KaFai Lau
    Cc: Guillaume Nault
    Cc: David Ahern
    Cc: Cong Wang
    Signed-off-by: David S. Miller

    Wei Wang
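
The reason an atomic exchange avoids the double free can be shown with C11 atomics. This is a userspace analogue under stated assumptions: `struct dst_sketch`, `struct sock_sketch`, `dst_release_sketch()` and `dst_cache_reset()` are hypothetical, and the `released` flag only stands in for real memory being freed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct dst_sketch { int released; };

struct sock_sketch {
    _Atomic(struct dst_sketch *) dst_cache;
};

static void dst_release_sketch(struct dst_sketch *d)
{
    if (d) {
        assert(d->released == 0); /* a second release would be a double free */
        d->released = 1;
    }
}

/*
 * Clear the cached entry with an atomic exchange. Only the caller that
 * actually receives the old pointer releases it; a concurrent or later
 * caller gets NULL and releases nothing.
 */
void dst_cache_reset(struct sock_sketch *sk)
{
    struct dst_sketch *old;

    old = atomic_exchange(&sk->dst_cache, (struct dst_sketch *)NULL);
    dst_release_sketch(old);
}
```

A plain load followed by a separate store would let two callers observe the same old pointer and both release it, which is the kind of race the commit describes.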
     

12 Aug, 2018

8 commits

  • Return -ENOIOCTLCMD for unknown ioctl commands. This lets dev_ioctl()
    handle generic socket ioctls like SIOCGIFNAME or SIOCGIFINDEX.
    PF_PPPOX/PX_PROTO_OL2TP was one of the few socket types not honouring
    this mechanism.

    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault
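
The fallback mechanism this commit relies on can be sketched as a two-stage dispatch. The value 515 mirrors the kernel-internal ENOIOCTLCMD code, which is not exposed in userspace `errno.h`. The command numbers and the three functions are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <errno.h>

#define SK_ENOIOCTLCMD 515 /* kernel-internal "no ioctl command" code */

/* Hypothetical command numbers for this sketch only. */
enum { CMD_PROTO_STATS = 1, CMD_GENERIC_IFNAME = 2 };

/* Protocol handler: claims what it knows and declines the rest. */
static int proto_ioctl(int cmd)
{
    switch (cmd) {
    case CMD_PROTO_STATS:
        return 0;
    default:
        return -SK_ENOIOCTLCMD;
    }
}

/* Generic fallback, standing in for the device-level handler. */
static int generic_ioctl(int cmd)
{
    return cmd == CMD_GENERIC_IFNAME ? 0 : -ENOTTY;
}

/* Socket-level entry point: try the protocol first, then fall back. */
int sock_ioctl_sketch(int cmd)
{
    int err = proto_ioctl(cmd);

    if (err == -SK_ENOIOCTLCMD)
        err = generic_ioctl(cmd);

    assert(err <= 0); /* handlers return 0 or a negative error code */
    return err;
}
```

A protocol handler that returned a different error for unknown commands would stop the dispatch early, so generic commands would never reach the fallback.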
     
  • Integrate memset(0) in pppol2tp_copy_stats() to avoid calling it
    manually every time.

    While there, constify 'stats'.

    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault
     
  • pppol2tp_ioctl() has everything in place for handling PPPIOCGL2TPSTATS
    on session sockets. We just need to copy the stats and set ->session_id.

    As a side effect of sharing session and tunnel code, ->using_ipsec is
    properly set even when the request was made using a session socket.

    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault
     
  • Handle PPPIOCGL2TPSTATS in pppol2tp_ioctl() if the socket represents a
    tunnel. This one is a bit special because the caller may use the tunnel
    socket to retrieve statistics of one of its sessions. If the session_id
    is set, the corresponding session's statistics are returned, instead of
    those of the tunnel. This is handled by the new
    pppol2tp_tunnel_copy_stats() helper function.

    Set ->tunnel_id and ->using_ipsec out of the conditional, so
    that it can be used by the 'else' branch in the following patch.
    We cannot do that for ->session_id, because tunnel sockets have to
    report the value that was originally passed in 'stats.session_id',
    while session sockets have to report their own session_id.

    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault
     
  • Let pppol2tp_ioctl() handle ioctl commands directly. It still relies on
    pppol2tp_{session,tunnel}_ioctl() for PPPIOCGL2TPSTATS.

    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault
     
  • * Drop test on 'sk': sock->sk cannot be NULL, or pppox_ioctl() could
    not have called us.

    * Drop test on 'SOCK_DEAD' state: if this flag was set, the socket
    would be in the process of being released and no ioctl could be
    running anymore.

    * Drop test on 'PPPOX_*' state: we depend on ->sk_user_data to get
    the session structure. If it is non-NULL, then the socket is
    connected. Testing for PPPOX_* is redundant.

    * Retrieve session using ->sk_user_data directly, instead of going
    through pppol2tp_sock_to_session(). This avoids grabbing a useless
    reference on the socket.

    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault
     
  • l2tp_session_get() is used for two different purposes. If 'tunnel' is
    NULL, the session is searched globally in the supplied network
    namespace. Otherwise it is searched exclusively in the tunnel context.

    Callers always know the context in which they need to search the
    session. But some of them do provide both a namespace and a tunnel,
    making the semantics of the call unclear.

    This patch defines l2tp_tunnel_get_session() for lookups done in a
    tunnel and restricts l2tp_session_get() to namespace searches.

    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault
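
    The split between the two lookups can be illustrated with a small
    sketch. The container types and array-based storage are hypothetical
    (the kernel uses hash tables and reference counting, both omitted
    here); only the separation of scopes is the point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified containers. */
struct session_sketch {
    unsigned int session_id;
};

struct tunnel_sketch {
    const struct session_sketch *sessions;
    size_t nsessions;
};

struct netns_sketch {
    const struct tunnel_sketch *tunnels;
    size_t ntunnels;
};

/* Lookup restricted to one tunnel: no namespace argument needed. */
const struct session_sketch *
tunnel_get_session_sketch(const struct tunnel_sketch *tunnel,
                          unsigned int session_id)
{
    size_t i;

    assert(tunnel != NULL);
    for (i = 0; i < tunnel->nsessions; i++)
        if (tunnel->sessions[i].session_id == session_id)
            return &tunnel->sessions[i];
    return NULL;
}

/* Namespace-wide lookup: no tunnel argument, so the scope is unambiguous. */
const struct session_sketch *
session_get_sketch(const struct netns_sketch *net, unsigned int session_id)
{
    size_t i;

    assert(net != NULL);
    for (i = 0; i < net->ntunnels; i++) {
        const struct session_sketch *s =
            tunnel_get_session_sketch(&net->tunnels[i], session_id);

        if (s != NULL)
            return s;
    }
    return NULL;
}
```

    Each function takes only the context it searches, so a caller cannot
    pass both a namespace and a tunnel.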
     
  • Use helper function to figure out if a tunnel is using ipsec.
    Also, avoid accessing ->sk_policy directly since it's RCU protected.

    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault
     

06 Aug, 2018

1 commit


04 Aug, 2018

4 commits

  • If 'session' is not NULL and is not a PPP pseudo-wire, then we fail to
    drop the reference taken by l2tp_session_get().

    Fixes: ecd012e45ab5 ("l2tp: filter out non-PPP sessions in pppol2tp_tunnel_ioctl()")
    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault
     
  • This attribute's handling is broken. It can only be used when creating
    Ethernet pseudo-wires, in which case its value can be used as the
    initial MTU for the l2tpeth device.
    However, when handling update requests, L2TP_ATTR_MTU only modifies
    session->mtu. This value is never propagated to the l2tpeth device.
    Dump requests also return the value of session->mtu, which is not
    synchronised anymore with the device MTU.

    The same problem occurs if the device MTU is properly updated using the
    generic IFLA_MTU attribute. In this case, session->mtu is not updated,
    and L2TP_ATTR_MTU will report an invalid value again when dumping the
    session.

    It does not seem worthwhile to complicate l2tp_eth.c to synchronise
    session->mtu with the device MTU. Even the ip-l2tp manpage advises
    using 'ip link' to initialise the MTU of l2tpeth devices (iproute2 does
    not handle L2TP_ATTR_MTU at all anyway). So let's just ignore it
    entirely.

    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault
     
  • The value of the session's .mtu field, as defined by
    pppol2tp_connect() or pppol2tp_session_create(), is later overwritten
    by pppol2tp_session_init() (unless getting the tunnel's socket PMTU
    fails). This field is then only used when setting the PPP channel's MTU
    in pppol2tp_connect().
    Furthermore, the SIOC[GS]IFMTU ioctls only act on the session's .mtu
    without propagating this value to the PPP channel, making them useless.

    This patch initialises the PPP channel's MTU directly and ignores the
    session's .mtu entirely. MTU is still computed by subtracting the
    PPPOL2TP_HEADER_OVERHEAD constant. It is not optimal, but that doesn't
    really matter: po->chan.mtu is only used when the channel is part of a
    multilink PPP bundle. Running multilink PPP over packet switched
    networks is certainly not going to be efficient, so not picking the
    best MTU does no harm (in the worst case, packets will just be
    fragmented by the underlay).

    The SIOC[GS]IFMTU ioctls are removed entirely (as opposed to simply
    ignored), because these ioctl commands are part of the requests that
    should be handled generically by the socket layer. PX_PROTO_OL2TP was
    the only socket type abusing these ioctls.

    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault
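
    The MTU computation mentioned above amounts to a single subtraction.
    In the sketch below the overhead value of 40 bytes and the function
    name are assumptions made for illustration; check
    PPPOL2TP_HEADER_OVERHEAD in l2tp_ppp.c for the value actually used.

```c
#include <assert.h>

/* Assumed placeholder for the fixed header overhead. */
#define HEADER_OVERHEAD_SKETCH 40u

/* Derive the PPP channel MTU from the tunnel MTU by subtracting a fixed
 * overhead, regardless of the headers actually in use. */
unsigned int chan_mtu_sketch(unsigned int tunnel_mtu)
{
    assert(tunnel_mtu > HEADER_OVERHEAD_SKETCH);
    return tunnel_mtu - HEADER_OVERHEAD_SKETCH;
}
```

    With a tunnel MTU of 1500 and the assumed overhead, the channel MTU
    would be 1460.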
     
  • Consolidate retrieval of tunnel's socket mtu in order to simplify
    l2tp_eth and l2tp_ppp a bit.

    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault