07 Feb, 2019

2 commits

  • Now that we have a dedicated NDO for getting a port's parent ID, get rid
    of SWITCHDEV_ATTR_ID_PORT_PARENT_ID and convert all callers to use the
    NDO exclusively. This is a preliminary change to getting rid of
    switchdev_ops eventually.

    Signed-off-by: Florian Fainelli
    Reviewed-by: Ido Schimmel
    Signed-off-by: David S. Miller

    Florian Fainelli
     
  • In preparation for getting rid of switchdev_ops, create a dedicated NDO
    operation for getting the port's parent identifier. There are
    essentially two classes of drivers that need to implement getting the
    port's parent ID which are VF/PF drivers with a built-in switch, and
    pure switchdev drivers such as mlxsw, ocelot, dsa etc.

    We introduce a helper function: dev_get_port_parent_id() which supports
    recursion into the lower devices to obtain the first port's parent ID.

    Convert the bridge, core and ipv4 multicast routing code to check for
    such ndo_get_port_parent_id() and call the helper function when valid
    before falling back to switchdev_port_attr_get(). This will allow us to
    convert all relevant drivers in one go instead of having to implement
    both switchdev_port_attr_get() and ndo_get_port_parent_id() operations,
    then get rid of switchdev_port_attr_get().

    Acked-by: Jiri Pirko
    Signed-off-by: Florian Fainelli
    Reviewed-by: Ido Schimmel
    Signed-off-by: David S. Miller

    Florian Fainelli
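
The lookup described above can be sketched in userspace C. This is a minimal model, not the kernel implementation: `struct sketch_dev`, `struct sketch_ppid`, `sketch_port_ppid()` and `sketch_get_port_parent_id()` are hypothetical names, and the sketch simply returns the first parent ID found among the lower devices.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_ID_LEN 32
#define SKETCH_MAX_LOWER 4

/* Reduced stand-in for the kernel's struct netdev_phys_item_id. */
struct sketch_ppid {
    unsigned char id[SKETCH_ID_LEN];
    unsigned char id_len;
};

/* Hypothetical, heavily simplified model of a net device. */
struct sketch_dev {
    /* Plays the role of ndo_get_port_parent_id; NULL if not implemented. */
    int (*get_port_parent_id)(const struct sketch_dev *dev,
                              struct sketch_ppid *ppid);
    const struct sketch_dev *lower[SKETCH_MAX_LOWER];
    size_t n_lower;
    unsigned char switch_id;
};

/* Example callback for a switch port: reports a one-byte parent ID. */
static int sketch_port_ppid(const struct sketch_dev *dev,
                            struct sketch_ppid *ppid)
{
    memset(ppid, 0, sizeof(*ppid));
    ppid->id[0] = dev->switch_id;
    ppid->id_len = 1;
    return 0;
}

/* Ask the device itself first, then optionally recurse into lower devices. */
static int sketch_get_port_parent_id(const struct sketch_dev *dev,
                                     struct sketch_ppid *ppid, int recurse)
{
    size_t i;

    assert(dev != NULL && ppid != NULL);

    if (dev->get_port_parent_id)
        return dev->get_port_parent_id(dev, ppid);
    if (!recurse)
        return -EOPNOTSUPP;

    for (i = 0; i < dev->n_lower; i++) {
        if (sketch_get_port_parent_id(dev->lower[i], ppid, recurse) == 0)
            return 0;
    }
    return -EOPNOTSUPP;
}
```

A stacked device (for example a bond over a switch port) has no callback of its own in this model, so it only yields an ID when recursion is requested.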
     

20 Jan, 2019

3 commits


18 Jan, 2019

1 commit

  • Drivers may not be able to support certain FDB entries, and an error
    code is insufficient to give clear hints as to the reasons of rejection.

    In order to make it possible to communicate the rejection reason, extend
    ndo_fdb_add() with an extack argument. Adapt the existing
    implementations of ndo_fdb_add() to take the parameter (and ignore it).
    Pass the extack parameter when invoking ndo_fdb_add() from rtnl_fdb_add().

    Signed-off-by: Petr Machata
    Signed-off-by: David S. Miller

    Petr Machata
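
As a rough illustration of why an extack argument helps, the following userspace sketch models a driver callback that returns an error code and also records a textual reason. `struct sketch_extack`, `sketch_set_err()`, `sketch_fdb_add()` and the VLAN limit are assumptions made for the example, not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_MAX_VID 4094

/* Reduced stand-in for struct netlink_ext_ack: only a message pointer. */
struct sketch_extack {
    const char *msg;
};

/* Record a reason only when the caller supplied an extack. */
static void sketch_set_err(struct sketch_extack *extack, const char *msg)
{
    if (extack)
        extack->msg = msg;
}

/* Hypothetical driver callback modelled on ndo_fdb_add() with extack. */
static int sketch_fdb_add(const unsigned char *addr, unsigned int vid,
                          struct sketch_extack *extack)
{
    assert(addr != NULL);

    if (vid > SKETCH_MAX_VID) {
        sketch_set_err(extack, "VLAN ID not supported by this device");
        return -EINVAL;
    }
    return 0;
}
```

Callers that have no extack to offer can pass NULL and still get the plain error code.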
     

31 Dec, 2018

1 commit

  • We must have an address to look up; otherwise we'll dereference a null
    pointer in the ndo_fdb_get callbacks.

    CC: Roopa Prabhu
    CC: David Ahern
    Reported-by: syzbot+017b1f61c82a1c3e7efd@syzkaller.appspotmail.com
    Fixes: 5b2f94b27622 ("net: rtnetlink: support for fdb get")
    Signed-off-by: Nikolay Aleksandrov
    Acked-by: Roopa Prabhu
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

20 Dec, 2018

1 commit

  • This patch registers a neigh doit handler. The doit handler
    returns a neigh entry given dst and dev. This is similar
    to the route and fdb doit (get) handlers. It also moves the nda_policy
    declaration from rtnetlink.c to neighbour.c.

    Signed-off-by: Roopa Prabhu
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller

    Roopa Prabhu
     

17 Dec, 2018

1 commit

  • This patch adds support for fdb get, similar to route get.
    Arguments can be any of the following (similar to fdb add/del/dump):
    [bridge, mac, vlan] or
    [bridge_port, mac, vlan, flags=[NTF_MASTER]] or
    [dev, mac, [vni|vlan], flags=[NTF_SELF]]

    Signed-off-by: Roopa Prabhu
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller

    Roopa Prabhu
     

14 Dec, 2018

1 commit

  • A follow-up patch will add a notifier type NETDEV_PRE_CHANGEADDR, which
    allows vetoing of MAC address changes. One prominent path to that
    notification is through dev_set_mac_address(). Therefore give this
    function an extack argument, so that it can be packed together with the
    notification. Thus a textual reason for rejection (or a warning) can be
    communicated back to the user.

    Signed-off-by: Petr Machata
    Acked-by: Jiri Pirko
    Reviewed-by: Ido Schimmel
    Signed-off-by: David S. Miller

    Petr Machata
     

13 Dec, 2018

1 commit

  • Drivers may not be able to implement a VLAN addition or reconfiguration.
    In those cases it's desirable to explain to the user that it was
    rejected (and why).

    To that end, add extack argument to ndo_bridge_setlink. Adapt all users
    to that change.

    Following patches will use the new argument in the bridge driver.

    Signed-off-by: Petr Machata
    Acked-by: Jiri Pirko
    Reviewed-by: Ido Schimmel
    Signed-off-by: David S. Miller

    Petr Machata
     

10 Dec, 2018

1 commit

  • Several conflicts, seemingly all over the place.

    I used Stephen Rothwell's sample resolutions for many of these, if
    only to double check my own work, so the credit largely
    goes to him.

    The NFP conflict consisted of a bug fix (moving operations
    past the rhashtable operation) while changing the initial
    argument in the function call in the moved code.

    The net/dsa/master.c conflict had to do with a bug fix intermixing of
    making dsa_master_set_mtu() static with the fixing of the tagging
    attribute location.

    cls_flower had a conflict because the dup reject fix from Or
    overlapped with the addition of port range classification.

    __set_phy_supported()'s conflict was relatively easy to resolve
    because Andrew fixed it in both trees, so it was just a matter
    of taking the net-next copy. Or at least I think it was :-)

    Joe Stringer's fix to the handling of netns id 0 in bpf_sk_lookup()
    intermixed with changes on how the sdif and caller_net are calculated
    in these code paths in net-next.

    The remaining BPF conflicts were largely about the addition of the
    __bpf_md_ptr stuff in 'net' overlapping with adjustments and additions
    to the relevant data structure where the MD pointer macros are used.

    Signed-off-by: David S. Miller

    David S. Miller
     

07 Dec, 2018

2 commits

  • In order to pass extack together with NETDEV_PRE_UP notifications, it's
    necessary to route the extack to __dev_open() from diverse (possibly
    indirect) callers. The last missing API is __dev_change_flags().

    Therefore extend __dev_change_flags() with an extra extack argument and
    update the two existing users.

    Since the function declaration line is changed anyway, name the struct
    net_device argument to placate checkpatch.

    Signed-off-by: Petr Machata
    Acked-by: Jiri Pirko
    Reviewed-by: Ido Schimmel
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller

    Petr Machata
     
  • In order to pass extack together with NETDEV_PRE_UP notifications, it's
    necessary to route the extack to __dev_open() from diverse (possibly
    indirect) callers. One prominent API through which the notification is
    invoked is dev_change_flags().

    Therefore extend dev_change_flags() with an extra extack argument and
    update all users. Most of the calls end up just encoding NULL, but
    several sites (VLAN, ipvlan, VRF, rtnetlink) do have extack available.

    Since the function declaration line is changed anyway, name the other
    function arguments to placate checkpatch.

    Signed-off-by: Petr Machata
    Acked-by: Jiri Pirko
    Reviewed-by: Ido Schimmel
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller

    Petr Machata
     

05 Dec, 2018

1 commit

  • kmsan was able to trigger a kernel-infoleak using a gre device [1]

    nlmsg_populate_fdb_fill() has a hard coded assumption
    that dev->addr_len is ETH_ALEN, as normally guaranteed
    for ARPHRD_ETHER devices.

    A similar issue was fixed recently in commit da71577545a5
    ("rtnetlink: Disallow FDB configuration for non-Ethernet device")

    [1]
    BUG: KMSAN: kernel-infoleak in copyout lib/iov_iter.c:143 [inline]
    BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x4c0/0x2700 lib/iov_iter.c:576
    CPU: 0 PID: 6697 Comm: syz-executor310 Not tainted 4.20.0-rc3+ #95
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x32d/0x480 lib/dump_stack.c:113
    kmsan_report+0x12c/0x290 mm/kmsan/kmsan.c:683
    kmsan_internal_check_memory+0x32a/0xa50 mm/kmsan/kmsan.c:743
    kmsan_copy_to_user+0x78/0xd0 mm/kmsan/kmsan_hooks.c:634
    copyout lib/iov_iter.c:143 [inline]
    _copy_to_iter+0x4c0/0x2700 lib/iov_iter.c:576
    copy_to_iter include/linux/uio.h:143 [inline]
    skb_copy_datagram_iter+0x4e2/0x1070 net/core/datagram.c:431
    skb_copy_datagram_msg include/linux/skbuff.h:3316 [inline]
    netlink_recvmsg+0x6f9/0x19d0 net/netlink/af_netlink.c:1975
    sock_recvmsg_nosec net/socket.c:794 [inline]
    sock_recvmsg+0x1d1/0x230 net/socket.c:801
    ___sys_recvmsg+0x444/0xae0 net/socket.c:2278
    __sys_recvmsg net/socket.c:2327 [inline]
    __do_sys_recvmsg net/socket.c:2337 [inline]
    __se_sys_recvmsg+0x2fa/0x450 net/socket.c:2334
    __x64_sys_recvmsg+0x4a/0x70 net/socket.c:2334
    do_syscall_64+0xcf/0x110 arch/x86/entry/common.c:291
    entry_SYSCALL_64_after_hwframe+0x63/0xe7
    RIP: 0033:0x441119
    Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 0f 83 db 0a fc ff c3 66 2e 0f 1f 84 00 00 00 00
    RSP: 002b:00007fffc7f008a8 EFLAGS: 00000207 ORIG_RAX: 000000000000002f
    RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000441119
    RDX: 0000000000000040 RSI: 00000000200005c0 RDI: 0000000000000003
    RBP: 00000000006cc018 R08: 0000000000000100 R09: 0000000000000100
    R10: 0000000000000100 R11: 0000000000000207 R12: 0000000000402080
    R13: 0000000000402110 R14: 0000000000000000 R15: 0000000000000000

    Uninit was stored to memory at:
    kmsan_save_stack_with_flags mm/kmsan/kmsan.c:246 [inline]
    kmsan_save_stack mm/kmsan/kmsan.c:261 [inline]
    kmsan_internal_chain_origin+0x13d/0x240 mm/kmsan/kmsan.c:469
    kmsan_memcpy_memmove_metadata+0x1a9/0xf70 mm/kmsan/kmsan.c:344
    kmsan_memcpy_metadata+0xb/0x10 mm/kmsan/kmsan.c:362
    __msan_memcpy+0x61/0x70 mm/kmsan/kmsan_instr.c:162
    __nla_put lib/nlattr.c:744 [inline]
    nla_put+0x20a/0x2d0 lib/nlattr.c:802
    nlmsg_populate_fdb_fill+0x444/0x810 net/core/rtnetlink.c:3466
    nlmsg_populate_fdb net/core/rtnetlink.c:3775 [inline]
    ndo_dflt_fdb_dump+0x73a/0x960 net/core/rtnetlink.c:3807
    rtnl_fdb_dump+0x1318/0x1cb0 net/core/rtnetlink.c:3979
    netlink_dump+0xc79/0x1c90 net/netlink/af_netlink.c:2244
    __netlink_dump_start+0x10c4/0x11d0 net/netlink/af_netlink.c:2352
    netlink_dump_start include/linux/netlink.h:216 [inline]
    rtnetlink_rcv_msg+0x141b/0x1540 net/core/rtnetlink.c:4910
    netlink_rcv_skb+0x394/0x640 net/netlink/af_netlink.c:2477
    rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:4965
    netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
    netlink_unicast+0x1699/0x1740 net/netlink/af_netlink.c:1336
    netlink_sendmsg+0x13c7/0x1440 net/netlink/af_netlink.c:1917
    sock_sendmsg_nosec net/socket.c:621 [inline]
    sock_sendmsg net/socket.c:631 [inline]
    ___sys_sendmsg+0xe3b/0x1240 net/socket.c:2116
    __sys_sendmsg net/socket.c:2154 [inline]
    __do_sys_sendmsg net/socket.c:2163 [inline]
    __se_sys_sendmsg+0x305/0x460 net/socket.c:2161
    __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
    do_syscall_64+0xcf/0x110 arch/x86/entry/common.c:291
    entry_SYSCALL_64_after_hwframe+0x63/0xe7

    Uninit was created at:
    kmsan_save_stack_with_flags mm/kmsan/kmsan.c:246 [inline]
    kmsan_internal_poison_shadow+0x6d/0x130 mm/kmsan/kmsan.c:170
    kmsan_kmalloc+0xa1/0x100 mm/kmsan/kmsan_hooks.c:186
    __kmalloc+0x14c/0x4d0 mm/slub.c:3825
    kmalloc include/linux/slab.h:551 [inline]
    __hw_addr_create_ex net/core/dev_addr_lists.c:34 [inline]
    __hw_addr_add_ex net/core/dev_addr_lists.c:80 [inline]
    __dev_mc_add+0x357/0x8a0 net/core/dev_addr_lists.c:670
    dev_mc_add+0x6d/0x80 net/core/dev_addr_lists.c:687
    ip_mc_filter_add net/ipv4/igmp.c:1128 [inline]
    igmp_group_added+0x4d4/0xb80 net/ipv4/igmp.c:1311
    __ip_mc_inc_group+0xea9/0xf70 net/ipv4/igmp.c:1444
    ip_mc_inc_group net/ipv4/igmp.c:1453 [inline]
    ip_mc_up+0x1c3/0x400 net/ipv4/igmp.c:1775
    inetdev_event+0x1d03/0x1d80 net/ipv4/devinet.c:1522
    notifier_call_chain kernel/notifier.c:93 [inline]
    __raw_notifier_call_chain kernel/notifier.c:394 [inline]
    raw_notifier_call_chain+0x13d/0x240 kernel/notifier.c:401
    __dev_notify_flags+0x3da/0x860 net/core/dev.c:1733
    dev_change_flags+0x1ac/0x230 net/core/dev.c:7569
    do_setlink+0x165f/0x5ea0 net/core/rtnetlink.c:2492
    rtnl_newlink+0x2ad7/0x35a0 net/core/rtnetlink.c:3111
    rtnetlink_rcv_msg+0x1148/0x1540 net/core/rtnetlink.c:4947
    netlink_rcv_skb+0x394/0x640 net/netlink/af_netlink.c:2477
    rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:4965
    netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
    netlink_unicast+0x1699/0x1740 net/netlink/af_netlink.c:1336
    netlink_sendmsg+0x13c7/0x1440 net/netlink/af_netlink.c:1917
    sock_sendmsg_nosec net/socket.c:621 [inline]
    sock_sendmsg net/socket.c:631 [inline]
    ___sys_sendmsg+0xe3b/0x1240 net/socket.c:2116
    __sys_sendmsg net/socket.c:2154 [inline]
    __do_sys_sendmsg net/socket.c:2163 [inline]
    __se_sys_sendmsg+0x305/0x460 net/socket.c:2161
    __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161
    do_syscall_64+0xcf/0x110 arch/x86/entry/common.c:291
    entry_SYSCALL_64_after_hwframe+0x63/0xe7

    Bytes 36-37 of 105 are uninitialized
    Memory access of size 105 starts at ffff88819686c000
    Data copied to user address 0000000020000380

    Fixes: d83b06036048 ("net: add fdb generic dump routine")
    Signed-off-by: Eric Dumazet
    Cc: John Fastabend
    Cc: Ido Schimmel
    Cc: David Ahern
    Reviewed-by: Ido Schimmel
    Signed-off-by: David S. Miller

    Eric Dumazet
     

01 Dec, 2018

2 commits

  • Standard kernel compilation produces the following warning:

    net/core/rtnetlink.c: In function ‘rtnl_newlink’:
    net/core/rtnetlink.c:3232:1: warning: the frame size of 1288 bytes is larger than 1024 bytes [-Wframe-larger-than=]
    }
    ^

    This should not really be an issue, as rtnl_newlink() stack is
    generally quite shallow.

    Fix the warning by allocating attributes with kmalloc() in a wrapper
    and passing it down to rtnl_newlink(), avoiding complexities on error
    paths.

    Alternatively we could kmalloc() some structure within rtnl_newlink(),
    slave attributes look like a good candidate. In practice it adds to
    already rather high complexity and length of the function.

    Signed-off-by: Jakub Kicinski
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller

    Jakub Kicinski
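
The wrapper approach described above can be sketched as follows, with `malloc()` standing in for `kmalloc()`. The names `sketch_newlink()`, `sketch_newlink_inner()`, `struct sketch_attr` and the table size are assumptions for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define SKETCH_MAX_ATTR 64

struct sketch_attr {
    int type;
    const void *data;
};

/* Inner worker: uses the caller-provided table, so it needs no large
 * on-stack array and no cleanup of that table on its error paths. */
static int sketch_newlink_inner(struct sketch_attr **attrbuf, int kind)
{
    assert(attrbuf != NULL);

    attrbuf[0] = NULL;          /* attribute parsing would fill the table */
    if (kind < 0)
        return -EINVAL;         /* early return, nothing to free here */
    return 0;
}

/* Wrapper: owns the heap allocation and frees it on every path. */
static int sketch_newlink(int kind)
{
    struct sketch_attr **attrbuf;
    int ret;

    attrbuf = malloc(SKETCH_MAX_ATTR * sizeof(*attrbuf));
    if (!attrbuf)
        return -ENOMEM;

    ret = sketch_newlink_inner(attrbuf, kind);
    free(attrbuf);
    return ret;
}
```

Keeping allocation and release in one small function is what lets the inner function return early without extra error-path bookkeeping.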
     
  • rtnl_newlink() used to create VLAs based on link kind. Since
    commit ccf8dbcd062a ("rtnetlink: Remove VLA usage") statically
    sized array is created on the stack, so there is no more use
    for a separate code block that used to be the VLA's live range.

    While at it christmas tree the variables. Note that there is
    a goto-based retry so to be on the safe side the variables can
    no longer be initialized in place. It doesn't seem to matter,
    logically, but why make the code harder to read..

    Signed-off-by: Jakub Kicinski
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller

    Jakub Kicinski
     

28 Nov, 2018

1 commit

  • We have been adding many new bridge options, a big number of which are
    boolean but still take up netlink attribute ids and waste space in the skb.
    Recently we discussed learning from link-local packets[1] and decided
    yet another new boolean option will be needed, thus introducing this API
    to save some bridge nl space.
    The API supports changing the value of multiple boolean options at once
    via the br_boolopt_multi struct which has an optmask (which options to
    set, bit per opt) and optval (options' new values). Future boolean
    options will only be added to the br_boolopt_id enum and then will have
    to be handled in br_boolopt_toggle/get. The API will automatically
    add the ability to change and export them via netlink, sysfs can use the
    single boolopt function versions to do the same. The behaviour with
    failing/succeeding is the same as with normal netlink option changing.

    If an option requires mapping to internal kernel flag or needs special
    configuration to be enabled then it should be handled in
    br_boolopt_toggle. It should also be able to retrieve an option's current
    state via br_boolopt_get.

    v2: WARN_ON() on unsupported option as that shouldn't be possible and
    also will help catch people who add new options without handling
    them for both set and get. Pass down extack so if an option desires
    it could set it on error and be more user-friendly.

    [1] https://www.spinics.net/lists/netdev/msg532698.html

    Signed-off-by: Nikolay Aleksandrov
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

07 Nov, 2018

2 commits

  • Add extack arg to the nla_parse_nested calls in rtnl_newlink, and
    add messages for unknown device type and link network namespace id.
    In particular, it improves the failure message when the wrong link
    type is used. From
    $ ip li add bond1 type bonding
    RTNETLINK answers: Operation not supported
    to
    $ ip li add bond1 type bonding
    Error: Unknown device type.

    (The module name is bonding but the link type is bond.)

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     
  • Add extack arg to rtnl_create_link and add messages for invalid
    number of Tx or Rx queues.

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     

06 Nov, 2018

1 commit

  • For non-zero return from dumpit() we should break the loop
    in rtnl_dump_all() and return the result. Otherwise, e.g.,
    we could get the memory leak in inet6_dump_fib() [1]. The
    pointer to the allocated struct fib6_walker there (saved
    in cb->args) can be lost, reset on the next iteration.

    Fix it by partially restoring the previous behavior before
    commit c63586dc9b3e ("net: rtnl_dump_all needs to propagate
    error from dumpit function"). The returned error from
    dumpit() is still passed further.

    [1]:
    unreferenced object 0xffff88001322a200 (size 96):
    comm "sshd", pid 1484, jiffies 4296032768 (age 1432.542s)
    hex dump (first 32 bytes):
    00 01 00 00 00 00 ad de 00 02 00 00 00 00 ad de ................
    18 09 41 36 00 88 ff ff 18 09 41 36 00 88 ff ff ..A6......A6....
    backtrace:
    [] kmem_cache_alloc_trace+0x151/0x220
    [] inet6_dump_fib+0x68d/0x940
    [] rtnl_dump_all+0x1d9/0x2d0
    [] netlink_dump+0x945/0x11a0
    [] __netlink_dump_start+0x55d/0x800
    [] rtnetlink_rcv_msg+0x4fa/0xa00
    [] netlink_rcv_skb+0x29c/0x420
    [] rtnetlink_rcv+0x15/0x20
    [] netlink_unicast+0x4e3/0x6c0
    [] netlink_sendmsg+0x7f2/0xba0
    [] sock_sendmsg+0xba/0xf0
    [] __sys_sendto+0x1e4/0x330
    [] __x64_sys_sendto+0xe1/0x1a0
    [] do_syscall_64+0x9f/0x300
    [] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [] 0xffffffffffffffff

    Fixes: c63586dc9b3e ("net: rtnl_dump_all needs to propagate error from dumpit function")
    Signed-off-by: Alexey Kodanev
    Signed-off-by: David S. Miller

    Alexey Kodanev
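
The loop behaviour described above can be modelled in a few lines. This is a simplified sketch, not the rtnl_dump_all() code: `sketch_dump_all()`, `struct sketch_dump_cb` and the two sample handlers are hypothetical, and the per-family state reset is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef int (*sketch_dumpit_t)(void *state);

struct sketch_dump_cb {
    size_t family_idx;  /* where to resume on the next call */
    void *state;        /* handler state that must not be lost */
};

/* Sample handler that succeeds and counts how often it ran. */
static int sketch_dump_ok(void *state)
{
    ++*(int *)state;
    return 0;
}

/* Sample handler that fails, e.g. on invalid request attributes. */
static int sketch_dump_fail(void *state)
{
    (void)state;
    return -EINVAL;
}

static int sketch_dump_all(const sketch_dumpit_t *handlers, size_t n,
                           struct sketch_dump_cb *cb)
{
    size_t idx;
    int ret = 0;

    assert(handlers != NULL && cb != NULL);

    for (idx = cb->family_idx; idx < n; idx++) {
        if (!handlers[idx])
            continue;
        ret = handlers[idx](cb->state);
        if (ret)
            break;      /* stop here so this handler's state is kept */
    }
    cb->family_idx = idx;
    return ret;         /* the non-zero result is still passed on */
}
```

Breaking out of the loop keeps the resume index on the handler that returned non-zero, so its saved state is not overwritten by the next family.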
     

30 Oct, 2018

1 commit

  • When an FDB entry is configured, the address is validated to have the
    length of an Ethernet address, but the device for which the address is
    configured can be of any type.

    The above can result in the use of uninitialized memory when the address
    is later compared against existing addresses since 'dev->addr_len' is
    used and it may be greater than ETH_ALEN, as with ip6tnl devices.

    Fix this by making sure that FDB entries are only configured for
    Ethernet devices.

    BUG: KMSAN: uninit-value in memcmp+0x11d/0x180 lib/string.c:863
    CPU: 1 PID: 4318 Comm: syz-executor998 Not tainted 4.19.0-rc3+ #49
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
    Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x14b/0x190 lib/dump_stack.c:113
    kmsan_report+0x183/0x2b0 mm/kmsan/kmsan.c:956
    __msan_warning+0x70/0xc0 mm/kmsan/kmsan_instr.c:645
    memcmp+0x11d/0x180 lib/string.c:863
    dev_uc_add_excl+0x165/0x7b0 net/core/dev_addr_lists.c:464
    ndo_dflt_fdb_add net/core/rtnetlink.c:3463 [inline]
    rtnl_fdb_add+0x1081/0x1270 net/core/rtnetlink.c:3558
    rtnetlink_rcv_msg+0xa0b/0x1530 net/core/rtnetlink.c:4715
    netlink_rcv_skb+0x36e/0x5f0 net/netlink/af_netlink.c:2454
    rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:4733
    netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
    netlink_unicast+0x1638/0x1720 net/netlink/af_netlink.c:1343
    netlink_sendmsg+0x1205/0x1290 net/netlink/af_netlink.c:1908
    sock_sendmsg_nosec net/socket.c:621 [inline]
    sock_sendmsg net/socket.c:631 [inline]
    ___sys_sendmsg+0xe70/0x1290 net/socket.c:2114
    __sys_sendmsg net/socket.c:2152 [inline]
    __do_sys_sendmsg net/socket.c:2161 [inline]
    __se_sys_sendmsg+0x2a3/0x3d0 net/socket.c:2159
    __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2159
    do_syscall_64+0xb8/0x100 arch/x86/entry/common.c:291
    entry_SYSCALL_64_after_hwframe+0x63/0xe7
    RIP: 0033:0x440ee9
    Code: e8 cc ab 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7
    48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff
    ff 0f 83 bb 0a fc ff c3 66 2e 0f 1f 84 00 00 00 00
    RSP: 002b:00007fff6a93b518 EFLAGS: 00000213 ORIG_RAX: 000000000000002e
    RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000440ee9
    RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000003
    RBP: 0000000000000000 R08: 00000000004002c8 R09: 00000000004002c8
    R10: 00000000004002c8 R11: 0000000000000213 R12: 000000000000b4b0
    R13: 0000000000401ec0 R14: 0000000000000000 R15: 0000000000000000

    Uninit was created at:
    kmsan_save_stack_with_flags mm/kmsan/kmsan.c:256 [inline]
    kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:181
    kmsan_kmalloc+0x98/0x100 mm/kmsan/kmsan_hooks.c:91
    kmsan_slab_alloc+0x10/0x20 mm/kmsan/kmsan_hooks.c:100
    slab_post_alloc_hook mm/slab.h:446 [inline]
    slab_alloc_node mm/slub.c:2718 [inline]
    __kmalloc_node_track_caller+0x9e7/0x1160 mm/slub.c:4351
    __kmalloc_reserve net/core/skbuff.c:138 [inline]
    __alloc_skb+0x2f5/0x9e0 net/core/skbuff.c:206
    alloc_skb include/linux/skbuff.h:996 [inline]
    netlink_alloc_large_skb net/netlink/af_netlink.c:1189 [inline]
    netlink_sendmsg+0xb49/0x1290 net/netlink/af_netlink.c:1883
    sock_sendmsg_nosec net/socket.c:621 [inline]
    sock_sendmsg net/socket.c:631 [inline]
    ___sys_sendmsg+0xe70/0x1290 net/socket.c:2114
    __sys_sendmsg net/socket.c:2152 [inline]
    __do_sys_sendmsg net/socket.c:2161 [inline]
    __se_sys_sendmsg+0x2a3/0x3d0 net/socket.c:2159
    __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2159
    do_syscall_64+0xb8/0x100 arch/x86/entry/common.c:291
    entry_SYSCALL_64_after_hwframe+0x63/0xe7

    v2:
    * Make error message more specific (David)

    Fixes: 090096bf3db1 ("net: generic fdb support for drivers without ndo_fdb_")
    Signed-off-by: Ido Schimmel
    Reported-and-tested-by: syzbot+3a288d5f5530b901310e@syzkaller.appspotmail.com
    Reported-and-tested-by: syzbot+d53ab4e92a1db04110ff@syzkaller.appspotmail.com
    Cc: Vlad Yasevich
    Cc: David Ahern
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller

    Ido Schimmel
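
The validation described above amounts to two checks: the address must have Ethernet length, and the device must be an Ethernet device. A hedged userspace sketch, where `enum sketch_dev_type`, `struct sketch_netdev`, `sketch_fdb_validate()` and the error strings are assumptions made for the example:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_ETH_ALEN 6

/* Hypothetical device types; the kernel uses ARPHRD_* values instead. */
enum sketch_dev_type {
    SKETCH_DEV_ETHER,
    SKETCH_DEV_OTHER
};

struct sketch_netdev {
    enum sketch_dev_type type;
    unsigned char addr_len;     /* may exceed SKETCH_ETH_ALEN for tunnels */
};

static int sketch_fdb_validate(const struct sketch_netdev *dev,
                               const unsigned char *addr, size_t addr_len,
                               const char **errmsg)
{
    assert(dev != NULL && errmsg != NULL);

    if (!addr || addr_len != SKETCH_ETH_ALEN) {
        *errmsg = "invalid address";
        return -EINVAL;
    }
    /* Without this check, later comparisons would use dev->addr_len
     * bytes and could read past the six bytes that were supplied. */
    if (dev->type != SKETCH_DEV_ETHER) {
        *errmsg = "FDB entries are only supported on Ethernet devices";
        return -EINVAL;
    }
    return 0;
}
```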
     

25 Oct, 2018

1 commit

  • If an address, route or netconf dump request is sent for AF_UNSPEC, then
    rtnl_dump_all is used to do the dump across all address families. If one
    of the dumpit functions fails (e.g., invalid attributes in the dump
    request) then rtnl_dump_all needs to propagate that error so the user
    gets an appropriate response instead of just getting no data.

    Fixes: effe67926624 ("net: Enable kernel side filtering of route dumps")
    Fixes: 5fcd266a9f64 ("net/ipv4: Add support for dumping addresses for a specific device")
    Fixes: 6371a71f3a3b ("net/ipv6: Add support for dumping addresses for a specific device")
    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     

13 Oct, 2018

1 commit

  • This patch adds an option to have per-port vlan stats instead of the
    default global stats. The option can be set only when there are no port
    vlans in the bridge since we need to allocate the stats if it is set
    when vlans are being added to ports (and respectively free them
    when being deleted). Also bump RTNL_MAX_TYPE as the bridge is the
    largest user of options. The current stats design allows us to add
    these without any changes to the fast-path, it all comes down to
    the per-vlan stats pointer which, if this option is enabled, will
    be allocated for each port vlan instead of using the global bridge-wide
    one.

    CC: bridge@lists.linux-foundation.org
    CC: Roopa Prabhu
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

11 Oct, 2018

2 commits


09 Oct, 2018

6 commits

  • Update rtnl_fdb_dump for strict data checking. If the flag is set,
    the dump request is expected to have an ndmsg struct as the header
    potentially followed by one or more attributes. Any data passed in the
    header or as an attribute is taken as a request to influence the data
    returned. Only values supported by the dump handler are allowed to be
    non-0 or set in the request. At the moment only the NDA_IFINDEX and
    NDA_MASTER attributes are supported.

    Signed-off-by: David Ahern
    Acked-by: Christian Brauner
    Signed-off-by: David S. Miller

    David Ahern
     
  • Move the existing input checking for rtnl_fdb_dump into a helper,
    valid_fdb_dump_legacy. This function will retain the current
    logic that works around the 2 headers that userspace has been
    allowed to send up to this point.

    Signed-off-by: David Ahern
    Acked-by: Christian Brauner
    Signed-off-by: David S. Miller

    David Ahern
     
  • Update rtnl_stats_dump for strict data checking. If the flag is set,
    the dump request is expected to have an if_stats_msg struct as the header.
    All elements of the struct are expected to be 0 except filter_mask which
    must be non-0 (legacy behavior). No attributes are supported.

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
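
The header rules described above can be sketched as a small validator. `struct sketch_if_stats_msg` mirrors the usual if_stats_msg layout but is a local stand-in, and `sketch_valid_stats_dump()` and its error strings are hypothetical. The sketch reads "all elements" literally and therefore also requires the family field to be zero in strict mode, which is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for struct if_stats_msg. */
struct sketch_if_stats_msg {
    uint8_t  family;
    uint8_t  pad1;
    uint16_t pad2;
    uint32_t ifindex;
    uint32_t filter_mask;
};

static int sketch_valid_stats_dump(const void *payload, size_t len,
                                   int strict, const char **errmsg)
{
    struct sketch_if_stats_msg ifsm;

    assert(payload != NULL && errmsg != NULL);

    if (len < sizeof(ifsm)) {
        *errmsg = "invalid header for stats dump";
        return -EINVAL;
    }
    memcpy(&ifsm, payload, sizeof(ifsm));

    if (strict) {
        /* Everything except filter_mask must be zero. */
        if (ifsm.family || ifsm.pad1 || ifsm.pad2 || ifsm.ifindex) {
            *errmsg = "invalid values in header for stats dump";
            return -EINVAL;
        }
        /* No attributes may follow the header. */
        if (len > sizeof(ifsm)) {
            *errmsg = "attributes not supported in stats dump";
            return -EINVAL;
        }
    }
    /* Legacy rule, applied with or without strict checking. */
    if (!ifsm.filter_mask) {
        *errmsg = "filter mask must be set for stats dump";
        return -EINVAL;
    }
    return 0;
}
```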
     
  • Update rtnl_bridge_getlink for strict data checking. If the flag is set,
    the dump request is expected to have an ifinfomsg struct as the header
    potentially followed by one or more attributes. Any data passed in the
    header or as an attribute is taken as a request to influence the data
    returned. Only values supported by the dump handler are allowed to be
    non-0 or set in the request. At the moment only the IFLA_EXT_MASK
    attribute is supported.

    Signed-off-by: David Ahern
    Acked-by: Christian Brauner
    Signed-off-by: David S. Miller

    David Ahern
     
  • Update rtnl_dump_ifinfo for strict data checking. If the flag is set,
    the dump request is expected to have an ifinfomsg struct as the header
    potentially followed by one or more attributes. Any data passed in the
    header or as an attribute is taken as a request to influence the data
    returned. Only values supported by the dump handler are allowed to be
    non-0 or set in the request. At the moment only the IFA_TARGET_NETNSID,
    IFLA_EXT_MASK, IFLA_MASTER, and IFLA_LINKINFO attributes are supported.

    Existing code does not fail the dump if nlmsg_parse fails. That behavior
    is kept for non-strict checking.

    Signed-off-by: David Ahern
    Acked-by: Christian Brauner
    Signed-off-by: David S. Miller

    David Ahern
     
  • Make sure extack is passed to nlmsg_parse where easy to do so.
    Most of these are dump handlers and leveraging the extack in
    the netlink_callback.

    Signed-off-by: David Ahern
    Acked-by: Christian Brauner
    Signed-off-by: David S. Miller

    David Ahern
     

07 Oct, 2018

1 commit


06 Oct, 2018

1 commit

  • Currently, rtnl_fdb_dump() assumes the family header is 'struct ifinfomsg',
    which is not always true -- 'struct ndmsg' is used by iproute2 ('ip neigh').

    The problem is, the function bails out early if nlmsg_parse() fails, which
    does occur for iproute2 usage of 'struct ndmsg' because the payload length
    is shorter than the family header alone (as 'struct ifinfomsg' is assumed).

    This breaks backward compatibility with userspace -- nothing is sent back.

    Some examples with iproute2 and netlink library for go [1]:

    1) $ bridge fdb show
    33:33:00:00:00:01 dev ens3 self permanent
    01:00:5e:00:00:01 dev ens3 self permanent
    33:33:ff:15:98:30 dev ens3 self permanent

    This one works, as it uses 'struct ifinfomsg'.

    fdb_show() @ iproute2/bridge/fdb.c
    """
    .n.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)),
    ...
    if (rtnl_dump_request(&rth, RTM_GETNEIGH, [...]
    """

    2) $ ip --family bridge neigh
    RTNETLINK answers: Invalid argument
    Dump terminated

    This one fails, as it uses 'struct ndmsg'.

    do_show_or_flush() @ iproute2/ip/ipneigh.c
    """
    .n.nlmsg_type = RTM_GETNEIGH,
    .n.nlmsg_len = NLMSG_LENGTH(sizeof(struct ndmsg)),
    """

    3) $ ./neighlist
    < no output >

    This one fails, as it is 'struct ndmsg'-based.

    neighList() @ netlink/neigh_linux.go
    """
    req := h.newNetlinkRequest(unix.RTM_GETNEIGH, [...]
    msg := Ndmsg{
    """

    The actual breakage was introduced by commit 0ff50e83b512 ("net: rtnetlink:
    bail out from rtnl_fdb_dump() on parse error"), because nlmsg_parse() fails
    if the payload length (with the _actual_ family header) is less than the
    family header length alone (which is assumed, in parameter 'hdrlen').
    This is true in the examples above with struct ndmsg, with size and payload
    length shorter than struct ifinfomsg.

    However, that commit just intends to fix something under the assumption the
    family header is indeed a 'struct ifinfomsg' - by preventing access to the
    payload as such (via 'ifm' pointer) if the payload length is not sufficient
    to actually contain it.

    The assumption was introduced by commit 5e6d24358799 ("bridge: netlink dump
    interface at par with brctl"), to support iproute2's 'bridge fdb' command
    (not 'ip neigh') which indeed uses 'struct ifinfomsg', thus is not broken.

    So, in order to unbreak the 'struct ndmsg' family headers and still allow
    'struct ifinfomsg' to continue to work, check for the known message sizes
    used with 'struct ndmsg' in iproute2 (with zero or one attribute which is
    not used in this function anyway) then do not parse the data as ifinfomsg.

    Same examples with this patch applied (or revert/before the original fix):

    $ bridge fdb show
    33:33:00:00:00:01 dev ens3 self permanent
    01:00:5e:00:00:01 dev ens3 self permanent
    33:33:ff:15:98:30 dev ens3 self permanent

    $ ip --family bridge neigh
    dev ens3 lladdr 33:33:00:00:00:01 PERMANENT
    dev ens3 lladdr 01:00:5e:00:00:01 PERMANENT
    dev ens3 lladdr 33:33:ff:15:98:30 PERMANENT

    $ ./neighlist
    netlink.Neigh{LinkIndex:2, Family:7, State:128, Type:0, Flags:2, IP:net.IP(nil), HardwareAddr:net.HardwareAddr{0x33, 0x33, 0x0, 0x0, 0x0, 0x1}, LLIPAddr:net.IP(nil), Vlan:0, VNI:0}
    netlink.Neigh{LinkIndex:2, Family:7, State:128, Type:0, Flags:2, IP:net.IP(nil), HardwareAddr:net.HardwareAddr{0x1, 0x0, 0x5e, 0x0, 0x0, 0x1}, LLIPAddr:net.IP(nil), Vlan:0, VNI:0}
    netlink.Neigh{LinkIndex:2, Family:7, State:128, Type:0, Flags:2, IP:net.IP(nil), HardwareAddr:net.HardwareAddr{0x33, 0x33, 0xff, 0x15, 0x98, 0x30}, LLIPAddr:net.IP(nil), Vlan:0, VNI:0}

    Tested on mainline (v4.19-rc6) and net-next (3bd09b05b068).

    References:

    [1] netlink library for go (test-case)
    https://github.com/vishvananda/netlink

    $ cat ~/go/src/neighlist/main.go
    package main
    import ("fmt"; "syscall"; "github.com/vishvananda/netlink")
    func main() {
    neighs, _ := netlink.NeighList(0, syscall.AF_BRIDGE)
    for _, neigh := range neighs { fmt.Printf("%#v\n", neigh) }
    }

    $ export GOPATH=~/go
    $ go get github.com/vishvananda/netlink
    $ go build neighlist
    $ ~/go/src/neighlist/neighlist

    Thanks to David Ahern for suggestions to improve this patch.

    Fixes: 0ff50e83b512 ("net: rtnetlink: bail out from rtnl_fdb_dump() on parse error")
    Fixes: 5e6d24358799 ("bridge: netlink dump interface at par with brctl")
    Reported-by: Aidan Obley
    Signed-off-by: Mauricio Faria de Oliveira
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller

    Mauricio Faria de Oliveira
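
The size check described above can be illustrated with local stand-ins for the two headers. The struct layouts below mirror the usual ndmsg and ifinfomsg definitions but are redeclared here so the sketch is self-contained. `sketch_payload_is_ndmsg()` and `SKETCH_NLA_HDRLEN` are hypothetical names, and the "one attribute" case is assumed to carry a 32-bit value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for struct ndmsg (12 bytes). */
struct sketch_ndmsg {
    uint8_t  family;
    uint8_t  pad1;
    uint16_t pad2;
    int32_t  ifindex;
    uint16_t state;
    uint8_t  flags;
    uint8_t  type;
};

/* Local stand-in for struct ifinfomsg (16 bytes). */
struct sketch_ifinfomsg {
    uint8_t  family;
    uint8_t  pad;
    uint16_t type;
    int32_t  index;
    uint32_t flags;
    uint32_t change;
};

#define SKETCH_NLA_HDRLEN 4     /* length and type fields, 2 bytes each */

static_assert(sizeof(struct sketch_ndmsg) == 12, "unexpected ndmsg size");
static_assert(sizeof(struct sketch_ifinfomsg) == 16, "unexpected ifinfomsg size");

/* True when the payload length matches an ndmsg header followed by
 * zero attributes or by exactly one 32-bit attribute. */
static int sketch_payload_is_ndmsg(size_t payload_len)
{
    size_t base = sizeof(struct sketch_ndmsg);

    return payload_len == base ||
           payload_len == base + SKETCH_NLA_HDRLEN + sizeof(uint32_t);
}
```

The two accepted lengths (12 and 20 bytes) do not collide with an ifinfomsg header alone (16 bytes) or one followed by a 32-bit attribute (24 bytes), which is what makes a length-based test workable in this model.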
     

04 Oct, 2018

1 commit


03 Oct, 2018

1 commit

  • We have an impressive number of syzkaller bugs that are linked
    to the fact that syzbot was able to create a networking device
    with millions of TX (or RX) queues.

    Let's limit the number of RX/TX queues to 4096, this really should
    cover all known cases.

    A separate patch will add various cond_resched() in the loops
    handling sysfs entries at device creation and dismantle.

    Tested:

    lpaa6:~# ip link add gre-4097 numtxqueues 4097 numrxqueues 4097 type ip6gretap
    RTNETLINK answers: Invalid argument

    lpaa6:~# time ip link add gre-4096 numtxqueues 4096 numrxqueues 4096 type ip6gretap

    real 0m0.180s
    user 0m0.000s
    sys 0m0.107s

    Fixes: 76ff5cc91935 ("rtnl: allow to specify number of rx and tx queues on device creation")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Signed-off-by: David S. Miller

    Eric Dumazet
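
A bounds check of the kind described above might look like the sketch below. `sketch_check_queues()` and its messages are hypothetical; only the 4096 limit comes from the commit message, and the lower bound of one queue is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_MAX_QUEUES 4096

/* Reject queue counts outside 1..SKETCH_MAX_QUEUES before any allocation. */
static int sketch_check_queues(unsigned int num_tx, unsigned int num_rx,
                               const char **errmsg)
{
    assert(errmsg != NULL);

    if (num_tx < 1 || num_tx > SKETCH_MAX_QUEUES) {
        *errmsg = "invalid number of transmit queues";
        return -EINVAL;
    }
    if (num_rx < 1 || num_rx > SKETCH_MAX_QUEUES) {
        *errmsg = "invalid number of receive queues";
        return -EINVAL;
    }
    return 0;
}
```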
     

02 Oct, 2018

1 commit

  • Link dumps can return results from a target namespace. If the namespace id
    is invalid, then the dump request should fail if get_target_net fails
    rather than continuing with a dump of the current namespace.

    Fixes: 79e1ad148c844 ("rtnetlink: use netnsid to query interface")
    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     

26 Sep, 2018

1 commit

  • Rtnl lock is encapsulated in netlink and cannot be accessed by other
    modules directly. This means that reference counted objects that rely on
    rtnl lock cannot use it with the refcount helper function that atomically
    decrements the reference count and obtains the mutex.

    This patch implements simple wrapper function around refcount_dec_and_lock
    that obtains rtnl lock if reference counter value reached 0.

    Signed-off-by: Vlad Buslov
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Vlad Buslov
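
The "decrement and take the lock only on the last reference" pattern can be sketched with C11 atomics. This is a userspace model under stated assumptions: the rtnl mutex is replaced by a simple spinlock built on `atomic_flag`, and all `sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for the rtnl mutex, modelled as a spinlock. */
static atomic_flag sketch_rtnl = ATOMIC_FLAG_INIT;

static void sketch_rtnl_lock(void)
{
    while (atomic_flag_test_and_set(&sketch_rtnl))
        ;   /* spin until the flag was previously clear */
}

static void sketch_rtnl_unlock(void)
{
    atomic_flag_clear(&sketch_rtnl);
}

/* Returns true with the lock held if the count dropped to zero,
 * otherwise returns false without holding the lock. */
static bool sketch_refcount_dec_and_rtnl_lock(atomic_uint *r)
{
    unsigned int old = atomic_load(r);

    assert(old > 0);

    /* Fast path: other references remain, so no lock is needed. */
    while (old > 1) {
        if (atomic_compare_exchange_weak(r, &old, old - 1))
            return false;
    }

    /* Slow path: this may be the last reference, so take the lock
     * before the final decrement. */
    sketch_rtnl_lock();
    if (atomic_fetch_sub(r, 1) != 1) {
        sketch_rtnl_unlock();
        return false;
    }
    return true;
}
```

A caller that gets true back is expected to tear the object down and then release the lock itself.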
     

19 Sep, 2018

1 commit


14 Sep, 2018

1 commit

  • This fix addresses https://bugzilla.kernel.org/show_bug.cgi?id=201071

    Commit 5025f7f7d506 wrongly relied on __dev_change_flags to notify users of
    dev flag changes in the case when dev->rtnl_link_state = RTNL_LINK_INITIALIZED.
    Fix it by indicating flag changes explicitly to __dev_notify_flags.

    Fixes: 5025f7f7d506 ("rtnetlink: add rtnl_link_state check in rtnl_configure_link")
    Reported-By: Liam mcbirnie
    Signed-off-by: Roopa Prabhu
    Signed-off-by: David S. Miller

    Roopa Prabhu