26 Nov, 2016

5 commits

  • In commit 10724cc7bb78 ("tipc: redesign connection-level flow control")
    we replaced the previous message based flow control with one based on
    1k blocks. In order to ensure backwards compatibility the mechanism
    falls back to using message as base unit when it senses that the peer
    doesn't support the new algorithm. The default flow control window,
    i.e., how many units can be sent before the sender blocks and waits
    for an acknowledge (aka advertisement) is 512. This was tested against
    the previous version, which uses an acknowledge frequency of one ack per
    256 received messages, and found to work fine.

    However, we missed the fact that versions older than Linux 3.15 use an
    acknowledge frequency of 512, which is exactly the limit where a 4.6+
    sender will stop and wait for acknowledge. This would also work fine if
    it weren't for the fact that if the first sent message on a 4.6+ server
    side is an empty SYNACK, this one is also counted as a sent message,
    while it is not counted as a received message on a legacy 3.15-receiver.
    This leads to the sender always being one step ahead of the receiver, a
    scenario causing the sender to block after 512 sent messages, while the
    receiver only has registered 511 read messages. Hence, the legacy
    receiver is not triggered to send an acknowledge, with a permanently
    blocked sender as result.

    We solve this deadlock by simply allowing the sender to send one more
    message before it blocks, i.e., by making a minimal change to the
    condition used for determining connection congestion.

    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
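
The deadlock described above can be reproduced with a small user-space model. This is only a sketch under simplifying assumptions: acknowledges arrive instantly, and the names `WINDOW`, `ACK_EVERY` and `simulate` are hypothetical, not kernel identifiers. The only difference between the two modes is whether the congestion test uses `>=` or `>`.

```python
WINDOW = 512      # flow control window of the 4.6+ sender, in messages
ACK_EVERY = 512   # legacy receiver acknowledges once per 512 received messages


def simulate(allow_one_extra, data_messages=2000):
    """Return how many data messages get through before the sender stalls."""
    sent_unacked = 1      # the empty SYNACK is counted by the sender ...
    received_unacked = 0  # ... but not by the legacy receiver
    delivered = 0
    while delivered < data_messages:
        if allow_one_extra:
            congested = sent_unacked > WINDOW
        else:
            congested = sent_unacked >= WINDOW
        if congested:
            return delivered  # no acknowledge is pending: blocked for good
        sent_unacked += 1
        received_unacked += 1
        delivered += 1
        if received_unacked == ACK_EVERY:
            # The receiver acknowledges; the sender's window reopens.
            sent_unacked -= ACK_EVERY
            received_unacked = 0
    return delivered
```

In this model the strict test stalls after 511 data messages (512 counting the SYNACK), while the relaxed test lets the 512th data message through, which is exactly what triggers the legacy receiver's acknowledge.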
     
  • The ETHTOOL_GLINKSETTINGS command is deprecating the ETHTOOL_GSET
    command and likewise it shouldn't require the CAP_NET_ADMIN capability.

    Signed-off-by: Miroslav Lichvar
    Signed-off-by: David S. Miller

    Miroslav Lichvar
     
  • In commit 35c55c9877f8 ("tipc: add neighbor monitoring framework") we
    added a data area to the link monitor STATE messages under the
    assumption that previous versions did not use any such data area.

    For versions older than Linux 4.3 this assumption is not correct. In
    those versions, all STATE messages sent out from a node inadvertently
    contain a 16 byte data area containing a string, a leftover from
    previous RESET messages which were using this during the setup phase.
    This string serves no purpose in STATE messages, and should not be there.

    Unfortunately, this data area is delivered to the link monitor
    framework, where a sanity check catches that it is not a correct domain
    record, and drops it. It also issues a rate limited warning about the
    event.

    Since such events occur much more frequently than anticipated, we now
    choose to remove the warning in order to not fill the kernel log with
    useless contents. We also make the sanity check stricter, to further
    reduce the risk that such data is inadvertently admitted.

    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     
  • commit 817298102b0b ("tipc: fix link priority propagation") introduced a
    compatibility problem between TIPC versions newer than Linux 4.6 and
    those older than Linux 4.4. In versions later than 4.4, link STATE
    messages only contain a non-zero link priority value when the sender
    wants the receiver to change its priority. This has the effect that the
    receiver resets itself in order to apply the new priority. This works
    well, and is consistent with the said commit.

    However, in versions older than 4.4 a valid link priority is present in
    all sent link STATE messages, leading to cyclic link establishment and
    reset on the 4.6+ node.

    We fix this by adding a test that the received value should not only
    be valid, but also differ from the current value in order to cause the
    receiving link endpoint to reset.

    Reported-by: Amar Nv
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
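
The effect of the extra test can be illustrated with a toy model of the receiving endpoint. It is a sketch, not the kernel code: `resets_caused` is a hypothetical helper, and the valid range 1..31 is an assumption about what counts as a valid non-zero priority.

```python
TIPC_MAX_LINK_PRI = 31  # assumed upper bound for a valid priority


def resets_caused(state_priorities, current, require_change):
    """Count link resets triggered by a sequence of received STATE priorities."""
    resets = 0
    for received in state_priorities:
        valid = 1 <= received <= TIPC_MAX_LINK_PRI
        if valid and (received != current or not require_change):
            current = received  # the endpoint resets and applies the priority
            resets += 1
    return resets
```

A legacy peer that repeats priority 10 in every STATE message resets the link each time when only validity is checked, and never when the value must also differ from the current one. A newer peer that sends a non-zero value once still causes exactly one reset in both cases.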
     
  • …ux/kernel/git/mkl/linux-can

    Marc Kleine-Budde says:

    ====================
    pull-request: can 2016-11-23

    this is a pull request for net/master.

    The patch by Oliver Hartkopp for the broadcast manager (bcm) fixes the
    CAN-FD support, which may cause an out-of-bounds access otherwise.
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>

    David S. Miller
     

25 Nov, 2016

4 commits

  • Johan Hedberg says:

    ====================
    pull request: bluetooth 2016-11-23

    Sorry about the late pull request for 4.9, but we have one more
    important Bluetooth patch that should make it to the release. It fixes
    connection creation for Bluetooth LE controllers that do not have a
    public address (only a random one).

    Please let me know if there are any issues pulling. Thanks.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Should pass valid filter handle, not the netlink flags.

    Fixes: 30a391a13ab92 ("net sched filters: pass netlink message flags in event notification")
    Signed-off-by: Roman Mashak
    Signed-off-by: Jamal Hadi Salim
    Reported-by: Cong Wang
    Acked-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Roman Mashak
     
  • In commits 93821778def10 ("udp: Fix rcv socket locking") and
    f7ad74fef3af ("net/ipv6/udp: UDP encapsulation: break backlog_rcv into
    __udpv6_queue_rcv_skb") UDP backlog handlers were renamed, but UDPlite
    was forgotten.

    This leads to crashes if UDPlite header is pulled twice, which happens
    starting from commit e6afc8ace6dd ("udp: remove headers from UDP packets
    before queueing")

    Bug found by syzkaller team, thanks a lot guys !

    Note that backlog use in UDP/UDPlite is scheduled to be removed starting
    from linux-4.10, so this patch is only needed up to linux-4.9

    Fixes: 93821778def1 ("udp: Fix rcv socket locking")
    Fixes: f7ad74fef3af ("net/ipv6/udp: UDP encapsulation: break backlog_rcv into __udpv6_queue_rcv_skb")
    Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing")
    Signed-off-by: Eric Dumazet
    Reported-by: Andrey Konovalov
    Cc: Benjamin LaHaise
    Cc: Herbert Xu
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • When an ipv6 address has the tentative flag set, it can't be
    used as source for egress traffic, while the associated route,
    if any, can be looked up and even stored into some dst_cache.

    In the latter scenario, the source ipv6 address selected and
    stored in the cache is most probably wrong (e.g. with
    link-local scope) and the entity using the dst_cache will
    experience lack of ipv6 connectivity until said cache is
    cleared or invalidated.

    Overall this may cause lack of connectivity over most IPv6 tunnels
    (comprising geneve and vxlan), if the first egress packet reaches
    the tunnel before the DaD is completed for the used ipv6
    address.

    This patch bumps a new genid after the IFA_F_TENTATIVE flag
    is cleared, so that dst_cache will be invalidated on
    next lookup and ipv6 connectivity restored.

    Fixes: 0c1d70af924b ("net: use dst_cache for vxlan device")
    Fixes: 468dfffcd762 ("geneve: add dst caching support")
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: Paolo Abeni
    Signed-off-by: David S. Miller

    Paolo Abeni
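
A minimal model of the invalidation idea, assuming a cache entry simply remembers the generation id it was created under. The classes, the helper names and the two addresses (a link-local one and one from the documentation prefix) are hypothetical and only serve the illustration.

```python
class Net:
    def __init__(self):
        self.genid = 0
        self.tentative = True  # DAD still running for the global address


def select_source(net):
    # While the global address is tentative only the link-local one is usable.
    return "fe80::1" if net.tentative else "2001:db8::1"


class DstCache:
    def __init__(self):
        self.entry = None  # (source address, genid at insertion time)

    def lookup(self, net):
        # A stale genid invalidates the entry and forces a fresh selection.
        if self.entry is None or self.entry[1] != net.genid:
            self.entry = (select_source(net), net.genid)
        return self.entry[0]


def finish_dad(net, bump_genid):
    net.tentative = False
    if bump_genid:
        net.genid += 1
```

Without the bump, a lookup made before DAD completes keeps returning the link-local source afterwards; with the bump, the next lookup reselects the source address.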
     

24 Nov, 2016

2 commits


23 Nov, 2016

3 commits

  • Since commit 6f3b911d5f29b98 ("can: bcm: add support for CAN FD frames") the
    CAN broadcast manager supports CAN and CAN FD data frames.

    As these data frames are embedded in struct can[fd]_frames which have a
    different length the access to the provided array of CAN frames became
    dependent on op->cfsiz. By using a struct canfd_frame pointer for the array of
    CAN frames the new offset calculation based on op->cfsiz was accidentally applied
    to CAN FD frame element lengths.

    This fix makes the pointer to the arrays of the different CAN frame types a
    void pointer so that the offset calculation in bytes accesses the correct CAN
    frame elements.

    Reference: http://marc.info/?l=linux-netdev&m=147980658909653

    Reported-by: Andrey Konovalov
    Signed-off-by: Oliver Hartkopp
    Tested-by: Andrey Konovalov
    Cc: linux-stable
    Signed-off-by: Marc Kleine-Budde

    Oliver Hartkopp
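
The arithmetic behind the out-of-bounds access can be sketched by computing byte offsets by hand. The two helpers are hypothetical and model C pointer arithmetic, assuming the usual sizes of 16 bytes for struct can_frame and 72 bytes for struct canfd_frame.

```python
CAN_MTU = 16     # sizeof(struct can_frame)
CANFD_MTU = 72   # sizeof(struct canfd_frame)


def offset_typed_pointer(i, cfsiz):
    # (struct canfd_frame *)frames + cfsiz * i
    # The index is scaled again by the size of the pointee.
    return cfsiz * i * CANFD_MTU


def offset_void_pointer(i, cfsiz):
    # (void *)frames + cfsiz * i
    # Plain byte arithmetic: cfsiz already is the element size.
    return cfsiz * i
```

For an array of four classic CAN frames (64 bytes), element 1 sits at byte 16. The typed pointer computes byte 1152 instead, far past the end of the buffer.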
     
  • The hci_get_route() API is used to look up local HCI devices, however
    so far it has been incapable of dealing with anything else than the
    public address of HCI devices. This completely breaks with LE-only HCI
    devices that do not come with a public address, but use a static
    random address instead.

    This patch extends the hci_get_route() API with a src_type parameter
    that's used for comparing with the right address of each HCI device.

    Signed-off-by: Johan Hedberg
    Signed-off-by: Marcel Holtmann

    Johan Hedberg
     
  • Andre Noll reported panics after my recent fix (commit 34fad54c2537
    "net: __skb_flow_dissect() must cap its return value")

    After some more headaches, Alexander root caused the problem to
    init_default_flow_dissectors() being called too late, in case
    a network driver like IGB is not a module and receives DHCP message
    very early.

    Fix is to call init_default_flow_dissectors() much earlier,
    as it is a core infrastructure and does not depend on another
    kernel service.

    Fixes: 06635a35d13d4 ("flow_dissect: use programable dissector in skb_flow_dissect and friends")
    Signed-off-by: Eric Dumazet
    Reported-by: Andre Noll
    Diagnosed-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Eric Dumazet
     

22 Nov, 2016

3 commits

  • Pull networking fixes from David Miller:

    1) Clear congestion control state when changing algorithms on an
    existing socket, from Florian Westphal.

    2) Fix register bit values in altr_tse_pcs portion of stmmac driver,
    from Jia Jie Ho.

    3) Fix PTP handling in stmmac driver for GMAC4, from Giuseppe
    CAVALLARO.

    4) Fix udplite multicast delivery handling, it ignores the udp_table
    parameter passed into the lookups, from Pablo Neira Ayuso.

    5) Synchronize the space estimated by rtnl_vfinfo_size and the space
    actually used by rtnl_fill_vfinfo. From Sabrina Dubroca.

    6) Fix memory leak in fib_info when splitting nodes, from Alexander
    Duyck.

    7) If a driver does a napi_hash_del() explicitly and not via
    netif_napi_del(), it must perform RCU synchronization as needed. Fix
    this in virtio-net and bnxt drivers, from Eric Dumazet.

    8) Likewise, it is not necessary to invoke napi_hash_del() if we are
    also doing netif_napi_del() in the same code path. Remove such calls
    from be2net and cxgb4 drivers, also from Eric Dumazet.

    9) Don't allocate an ID in peernet2id_alloc() if the netns is dead,
    from WANG Cong.

    10) Fix OF node and device struct leaks in of_mdio, from Johan Hovold.

    11) We cannot cache routes in ip6_tunnel when using inherited traffic
    classes, from Paolo Abeni.

    12) Fix several crashes and leaks in cpsw driver, from Johan Hovold.

    13) Splice operations cannot use freezable blocking calls in AF_UNIX,
    from WANG Cong.

    14) Link dump filtering by master device and kind support added an error
    in loop index updates during the dump if we actually do filter, fix
    from Zhang Shengju.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (59 commits)
    tcp: zero ca_priv area when switching cc algorithms
    net: l2tp: Treat NET_XMIT_CN as success in l2tp_eth_dev_xmit
    ethernet: stmmac: make DWMAC_STM32 depend on it's associated SoC
    tipc: eliminate obsolete socket locking policy description
    rtnl: fix the loop index update error in rtnl_dump_ifinfo()
    l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind()
    net: macb: add check for dma mapping error in start_xmit()
    rtnetlink: fix FDB size computation
    netns: fix get_net_ns_by_fd(int pid) typo
    af_unix: conditionally use freezable blocking calls in read
    net: ethernet: ti: cpsw: fix fixed-link phy probe deferral
    net: ethernet: ti: cpsw: add missing sanity check
    net: ethernet: ti: cpsw: fix secondary-emac probe error path
    net: ethernet: ti: cpsw: fix of_node and phydev leaks
    net: ethernet: ti: cpsw: fix deferred probe
    net: ethernet: ti: cpsw: fix mdio device reference leak
    net: ethernet: ti: cpsw: fix bad register access in probe error path
    net: sky2: Fix shutdown crash
    cfg80211: limit scan results cache size
    net sched filters: pass netlink message flags in event notification
    ...

    Linus Torvalds
     
  • We need to zero out the private data area when application switches
    connection to different algorithm (TCP_CONGESTION setsockopt).

    When congestion ops get assigned at connect time everything is already
    zeroed because sk_alloc uses GFP_ZERO flag. But in the setsockopt case
    this contains whatever previous cc placed there.

    Signed-off-by: Florian Westphal
    Signed-off-by: David S. Miller

    Florian Westphal
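
Why stale private data matters can be shown with a byte buffer shared by two algorithms. This is a sketch: the 16 byte size, the field layout and `switch_algorithm` are assumptions made for the example, not the kernel's layout.

```python
import struct

CA_PRIV_SIZE = 16  # hypothetical size of the private area


def switch_algorithm(ca_priv, zero_on_switch):
    """Hand the shared private area to a new algorithm and return the
    value the new algorithm sees in its first 32-bit field."""
    if zero_on_switch:
        ca_priv[:] = bytes(len(ca_priv))
    return struct.unpack_from("<I", ca_priv, 0)[0]
```

If the previous algorithm left `0xDEADBEEF` in the first word, the new one starts from that value unless the area is cleared on the switch.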
     
  • The tc could return NET_XMIT_CN as one congestion notification, but
    it does not mean the packet is lost. Other modules like ipvlan,
    macvlan, and others treat NET_XMIT_CN as success too.
    So l2tp_eth_dev_xmit should add the NET_XMIT_CN check.

    Signed-off-by: Gao Feng
    Signed-off-by: David S. Miller

    Gao Feng
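
The accounting change can be modelled as follows. The three return codes use the values from the kernel headers; `account` and the `stats` dictionary are hypothetical stand-ins for the driver's counters.

```python
NET_XMIT_SUCCESS = 0x00
NET_XMIT_DROP = 0x01
NET_XMIT_CN = 0x02


def account(ret, stats, cn_is_success):
    """Update transmit counters for one packet given the xmit return code."""
    ok = ret == NET_XMIT_SUCCESS or (cn_is_success and ret == NET_XMIT_CN)
    stats["tx_packets" if ok else "tx_dropped"] += 1
    return stats
```

With the extra check a congestion notification is counted as a transmitted packet, while a real drop is still counted as dropped.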
     

20 Nov, 2016

4 commits

  • The comment block in socket.c describing the locking policy is
    obsolete, and does not reflect current reality. We remove it in this
    commit.

    Since the current locking policy is much simpler and follows a
    mainstream approach, we see no need to add a new description.

    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     
  • If the link is filtered out, loop index should also be updated. If not,
    loop index will not be correct.

    Fixes: dc599f76c22b0 ("net: Add support for filtering link dump by master device and kind")
    Signed-off-by: Zhang Shengju
    Acked-by: David Ahern
    Signed-off-by: David S. Miller

    Zhang Shengju
     
  • Lock socket before checking the SOCK_ZAPPED flag in l2tp_ip6_bind().
    Without lock, a concurrent call could modify the socket flags between
    the sock_flag(sk, SOCK_ZAPPED) test and the lock_sock() call. This way,
    a socket could be inserted twice in l2tp_ip6_bind_table. Releasing it
    would then leave a stale pointer there, generating use-after-free
    errors when walking through the list or modifying adjacent entries.

    BUG: KASAN: use-after-free in l2tp_ip6_close+0x22e/0x290 at addr ffff8800081b0ed8
    Write of size 8 by task syz-executor/10987
    CPU: 0 PID: 10987 Comm: syz-executor Not tainted 4.8.0+ #39
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
    ffff880031d97838 ffffffff829f835b ffff88001b5a1640 ffff8800081b0ec0
    ffff8800081b15a0 ffff8800081b6d20 ffff880031d97860 ffffffff8174d3cc
    ffff880031d978f0 ffff8800081b0e80 ffff88001b5a1640 ffff880031d978e0
    Call Trace:
    [] dump_stack+0xb3/0x118 lib/dump_stack.c:15
    [] kasan_object_err+0x1c/0x70 mm/kasan/report.c:156
    [< inline >] print_address_description mm/kasan/report.c:194
    [] kasan_report_error+0x1f6/0x4d0 mm/kasan/report.c:283
    [< inline >] kasan_report mm/kasan/report.c:303
    [] __asan_report_store8_noabort+0x3e/0x40 mm/kasan/report.c:329
    [< inline >] __write_once_size ./include/linux/compiler.h:249
    [< inline >] __hlist_del ./include/linux/list.h:622
    [< inline >] hlist_del_init ./include/linux/list.h:637
    [] l2tp_ip6_close+0x22e/0x290 net/l2tp/l2tp_ip6.c:239
    [] inet_release+0xed/0x1c0 net/ipv4/af_inet.c:415
    [] inet6_release+0x50/0x70 net/ipv6/af_inet6.c:422
    [] sock_release+0x8d/0x1d0 net/socket.c:570
    [] sock_close+0x16/0x20 net/socket.c:1017
    [] __fput+0x28c/0x780 fs/file_table.c:208
    [] ____fput+0x15/0x20 fs/file_table.c:244
    [] task_work_run+0xf9/0x170
    [] do_exit+0x85e/0x2a00
    [] do_group_exit+0x108/0x330
    [] get_signal+0x617/0x17a0 kernel/signal.c:2307
    [] do_signal+0x7f/0x18f0
    [] exit_to_usermode_loop+0xbf/0x150 arch/x86/entry/common.c:156
    [< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:190
    [] syscall_return_slowpath+0x1a0/0x1e0 arch/x86/entry/common.c:259
    [] entry_SYSCALL_64_fastpath+0xc4/0xc6
    Object at ffff8800081b0ec0, in cache L2TP/IPv6 size: 1448
    Allocated:
    PID = 10987
    [ 1116.897025] [] save_stack_trace+0x16/0x20
    [ 1116.897025] [] save_stack+0x46/0xd0
    [ 1116.897025] [] kasan_kmalloc+0xad/0xe0
    [ 1116.897025] [] kasan_slab_alloc+0x12/0x20
    [ 1116.897025] [< inline >] slab_post_alloc_hook mm/slab.h:417
    [ 1116.897025] [< inline >] slab_alloc_node mm/slub.c:2708
    [ 1116.897025] [< inline >] slab_alloc mm/slub.c:2716
    [ 1116.897025] [] kmem_cache_alloc+0xc8/0x2b0 mm/slub.c:2721
    [ 1116.897025] [] sk_prot_alloc+0x69/0x2b0 net/core/sock.c:1326
    [ 1116.897025] [] sk_alloc+0x38/0xae0 net/core/sock.c:1388
    [ 1116.897025] [] inet6_create+0x2d7/0x1000 net/ipv6/af_inet6.c:182
    [ 1116.897025] [] __sock_create+0x37b/0x640 net/socket.c:1153
    [ 1116.897025] [< inline >] sock_create net/socket.c:1193
    [ 1116.897025] [< inline >] SYSC_socket net/socket.c:1223
    [ 1116.897025] [] SyS_socket+0xef/0x1b0 net/socket.c:1203
    [ 1116.897025] [] entry_SYSCALL_64_fastpath+0x23/0xc6
    Freed:
    PID = 10987
    [ 1116.897025] [] save_stack_trace+0x16/0x20
    [ 1116.897025] [] save_stack+0x46/0xd0
    [ 1116.897025] [] kasan_slab_free+0x71/0xb0
    [ 1116.897025] [< inline >] slab_free_hook mm/slub.c:1352
    [ 1116.897025] [< inline >] slab_free_freelist_hook mm/slub.c:1374
    [ 1116.897025] [< inline >] slab_free mm/slub.c:2951
    [ 1116.897025] [] kmem_cache_free+0xc8/0x330 mm/slub.c:2973
    [ 1116.897025] [< inline >] sk_prot_free net/core/sock.c:1369
    [ 1116.897025] [] __sk_destruct+0x32b/0x4f0 net/core/sock.c:1444
    [ 1116.897025] [] sk_destruct+0x44/0x80 net/core/sock.c:1452
    [ 1116.897025] [] __sk_free+0x53/0x220 net/core/sock.c:1460
    [ 1116.897025] [] sk_free+0x23/0x30 net/core/sock.c:1471
    [ 1116.897025] [] sk_common_release+0x28c/0x3e0 ./include/net/sock.h:1589
    [ 1116.897025] [] l2tp_ip6_close+0x1fe/0x290 net/l2tp/l2tp_ip6.c:243
    [ 1116.897025] [] inet_release+0xed/0x1c0 net/ipv4/af_inet.c:415
    [ 1116.897025] [] inet6_release+0x50/0x70 net/ipv6/af_inet6.c:422
    [ 1116.897025] [] sock_release+0x8d/0x1d0 net/socket.c:570
    [ 1116.897025] [] sock_close+0x16/0x20 net/socket.c:1017
    [ 1116.897025] [] __fput+0x28c/0x780 fs/file_table.c:208
    [ 1116.897025] [] ____fput+0x15/0x20 fs/file_table.c:244
    [ 1116.897025] [] task_work_run+0xf9/0x170
    [ 1116.897025] [] do_exit+0x85e/0x2a00
    [ 1116.897025] [] do_group_exit+0x108/0x330
    [ 1116.897025] [] get_signal+0x617/0x17a0 kernel/signal.c:2307
    [ 1116.897025] [] do_signal+0x7f/0x18f0
    [ 1116.897025] [] exit_to_usermode_loop+0xbf/0x150 arch/x86/entry/common.c:156
    [ 1116.897025] [< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:190
    [ 1116.897025] [] syscall_return_slowpath+0x1a0/0x1e0 arch/x86/entry/common.c:259
    [ 1116.897025] [] entry_SYSCALL_64_fastpath+0xc4/0xc6
    Memory state around the buggy address:
    ffff8800081b0d80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    ffff8800081b0e00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    >ffff8800081b0e80: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
    ^
    ffff8800081b0f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    ffff8800081b0f80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

    ==================================================================

    The same issue exists with l2tp_ip_bind() and l2tp_ip_bind_table.

    Fixes: c51ce49735c1 ("l2tp: fix oops in L2TP IP sockets for connect() AF_UNSPEC case")
    Reported-by: Baozeng Ding
    Reported-by: Andrey Konovalov
    Tested-by: Baozeng Ding
    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault
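
The race can be replayed deterministically with generators, where each `yield` marks a point at which a concurrent caller may run. This is a toy model with hypothetical names; a real lock is represented simply by the absence of a `yield` between the check and the insertion.

```python
def bind_steps(sock, table, lock_first):
    """Model one bind() call on a shared socket."""
    if lock_first:
        # Check and insert happen together under the lock.
        if sock["zapped"]:
            sock["zapped"] = False
            table.append(sock)
        yield
    else:
        zapped = sock["zapped"]  # unlocked check
        yield                    # a concurrent bind() may run here
        if zapped:
            sock["zapped"] = False
            table.append(sock)


def run_two_binds(lock_first):
    """Interleave two bind() calls and return how often the socket
    ended up in the bind table."""
    sock, table = {"zapped": True}, []
    a = bind_steps(sock, table, lock_first)
    b = bind_steps(sock, table, lock_first)
    for caller in (a, b, a, b):
        next(caller, None)
    return len(table)
```

With the unlocked check both callers see the flag set and the socket is inserted twice; with the check under the lock it is inserted once.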
     
  • Simon Wunderlich says:

    ====================
    Here are two batman-adv bugfix patches:

    - Revert a splat on disabling interface which created another problem,
    by Sven Eckelmann

    - Fix error handling when the primary interface disappears during a
    throughput meter test, by Sven Eckelmann
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

19 Nov, 2016

4 commits

  • Pull nfsd bugfix from Bruce Fields:
    "Just one fix for an NFS/RDMA crash"

    * tag 'nfsd-4.9-2' of git://linux-nfs.org/~bfields/linux:
    sunrpc: svc_age_temp_xprts_now should not call setsockopt non-tcp transports

    Linus Torvalds
     
  • Add missing NDA_VLAN attribute's size.

    Fixes: 1e53d5bb8878 ("net: Pass VLAN ID to rtnl_fdb_notify.")
    Signed-off-by: Sabrina Dubroca
    Signed-off-by: David S. Miller

    Sabrina Dubroca
     
  • …kernel/git/jberg/mac80211

    Johannes Berg says:

    ====================
    A few more bugfixes:
    * limit # of scan results stored in memory - this is a long-standing bug
    Jouni and I only noticed while discussing other things in Santa Fe
    * revert AP_LINK_PS patch that was causing issues (Felix)
    * various A-MSDU/A-MPDU fixes for TXQ code (Felix)
    * interoperability workaround for peers with broken VHT capabilities
    (Filip Matusiak)
    * add bitrate definition for a VHT MCS that's supposed to be invalid
    but gets used by some hardware anyway (Thomas Pedersen)
    * beacon timer fix in hwsim (Benjamin Beichler)
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>

    David S. Miller
     
  • Commit 2b15af6f95 ("af_unix: use freezable blocking calls in read")
    converts schedule_timeout() to its freezable version, it was probably
    correct at that time, but later, commit 2b514574f7e8
    ("net: af_unix: implement splice for stream af_unix sockets") breaks
    the strong requirement for a freezable sleep, according to
    commit 0f9548ca1091:

    We shouldn't try_to_freeze if locks are held. Holding a lock can cause a
    deadlock if the lock is later acquired in the suspend or hibernate path
    (e.g. by dpm). Holding a lock can also cause a deadlock in the case of
    cgroup_freezer if a lock is held inside a frozen cgroup that is later
    acquired by a process outside that group.

    The pipe_lock is still held at that point.

    So use freezable version only for the recvmsg call path, avoid impact for
    Android.

    Fixes: 2b514574f7e8 ("net: af_unix: implement splice for stream af_unix sockets")
    Reported-by: Dmitry Vyukov
    Cc: Tejun Heo
    Cc: Colin Cross
    Cc: Rafael J. Wysocki
    Cc: Hannes Frederic Sowa
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    WANG Cong
     

18 Nov, 2016

4 commits

  • It's possible to make scanning consume almost arbitrary amounts
    of memory, e.g. by sending beacon frames with random BSSIDs at
    high rates while somebody is scanning.

    Limit the number of BSS table entries we're willing to cache to
    1000, limiting maximum memory usage to maybe 4-5MB, but lower
    in practice - that would be the case for having both full-sized
    beacon and probe response frames for each entry; this seems not
    possible in practice, so a limit of 1000 entries will likely be
    closer to 0.5 MB.

    Cc: stable@vger.kernel.org
    Signed-off-by: Johannes Berg

    Johannes Berg
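
A bounded table of this kind can be sketched with an ordered dictionary. The limit of 1000 comes from the text above; evicting the least recently updated entry is an assumption made for this example, and `add_bss` is a hypothetical helper.

```python
from collections import OrderedDict

BSS_ENTRIES_LIMIT = 1000


def add_bss(table, bssid, frame):
    """Insert or refresh an entry, evicting the oldest one when over the limit."""
    table.pop(bssid, None)   # re-inserting moves the entry to the newest position
    table[bssid] = frame
    while len(table) > BSS_ENTRIES_LIMIT:
        table.popitem(last=False)  # drop the least recently updated entry
```

Feeding the table 5000 distinct BSSIDs leaves only the newest 1000 in memory, so a flood of random beacons can no longer grow the cache without bound.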
     
  • Userland client should be able to read an event, and reflect it back to
    the kernel, therefore it needs to extract complete set of netlink flags.

    For example, this will allow "tc monitor" to distinguish Add and Replace
    operations.

    Signed-off-by: Roman Mashak
    Signed-off-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    Roman Mashak
     
  • If an ip6 tunnel is configured to inherit the traffic class from
    the inner header, the dst_cache must be disabled or it will foul
    the policy routing.

    The issue is apparently there since at least Linux-2.6.12-rc2.

    Reported-by: Liam McBirnie
    Cc: Liam McBirnie
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: Paolo Abeni
    Signed-off-by: David S. Miller

    Paolo Abeni
     
  • Andrei reports we still allocate netns ID from idr after we destroy
    it in cleanup_net().

    cleanup_net():
    ...
    idr_destroy(&net->netns_ids);
    ...
    list_for_each_entry_reverse(ops, &pernet_list, list)
    ops_exit_list(ops, &net_exit_list);
    -> rollback_registered_many()
    -> rtmsg_ifinfo_build_skb()
    -> rtnl_fill_ifinfo()
    -> peernet2id_alloc()

    After that point we should not even access net->netns_ids, we
    should check the death of the current netns as early as we can in
    peernet2id_alloc().

    For net-next we can consider to avoid sending rtmsg totally,
    it is a good optimization for netns teardown path.

    Fixes: 0c7aecd4bde4 ("netns: add rtnl cmd to add and get peer netns ids")
    Reported-by: Andrei Vagin
    Cc: Nicolas Dichtel
    Signed-off-by: Cong Wang
    Acked-by: Andrei Vagin
    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    WANG Cong
     

17 Nov, 2016

3 commits

  • The IOP_XATTR flag is set on sockfs because sockfs supports getting the
    "system.sockprotoname" xattr. Since commit 6c6ef9f2, this flag is checked for
    setxattr support as well. This is wrong on sockfs because security xattr
    support there is supposed to be provided by security_inode_setsecurity. The
    smack security module relies on socket labels (xattrs).

    Fix this by adding a security xattr handler on sockfs that returns
    -EAGAIN, and by checking for -EAGAIN in setxattr.

    We cannot simply check for -EOPNOTSUPP in setxattr because there are
    filesystems that neither have direct security xattr support nor support
    via security_inode_setsecurity. A more proper fix might be to move the
    call to security_inode_setsecurity into sockfs, but it's not clear to me
    if that is safe: we would end up calling security_inode_post_setxattr after
    that as well.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Al Viro

    Andreas Gruenbacher
     
  • Fix a small memory leak that can occur where we leak a fib_alias in the
    event of us not being able to insert it into the local table.

    Fixes: 0ddcf43d5d4a0 ("ipv4: FIB Local/MAIN table collapse")
    Reported-by: Eric Dumazet
    Signed-off-by: Alexander Duyck
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Alexander Duyck
     
  • The patch that removed the FIB offload infrastructure was a bit too
    aggressive and also removed code needed to clean up after we split the table
    if additional rules were added. Specifically the function
    fib_trie_flush_external was called at the end of a new rule being added to
    flush the foreign trie entries from the main trie.

    I updated the code so that we only call fib_trie_flush_external on the main
    table so that we flush the entries for local from main. This way we don't
    call it for every rule change which is what was happening previously.

    Fixes: 347e3b28c1ba2 ("switchdev: remove FIB offload infrastructure")
    Reported-by: Eric Dumazet
    Cc: Jiri Pirko
    Signed-off-by: Alexander Duyck
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Alexander Duyck
     

16 Nov, 2016

4 commits

  • rtnl_xdp_size() only considers the size of the actual payload attribute,
    and misses the space taken by the attribute used for nesting (IFLA_XDP).

    Fixes: d1fdd9138682 ("rtnl: add option for setting link xdp prog")
    Signed-off-by: Sabrina Dubroca
    Reviewed-by: Brenden Blanco
    Signed-off-by: David S. Miller

    Sabrina Dubroca
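
The missing space is easy to quantify with the standard netlink size rules (4 byte attribute header, 4 byte alignment). The sketch below assumes a 1 byte payload attribute inside the nest; `xdp_size` is a hypothetical stand-in for the size function, not its actual code.

```python
NLA_ALIGNTO = 4
NLA_HDRLEN = 4  # aligned size of struct nlattr


def nla_align(length):
    return (length + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1)


def nla_total_size(payload):
    """Attribute header plus payload, padded to the alignment."""
    return nla_align(NLA_HDRLEN + payload)


def xdp_size(count_nest, payload=1):
    size = nla_total_size(payload)   # the payload attribute itself
    if count_nest:
        size += nla_total_size(0)    # header of the enclosing IFLA_XDP nest
    return size
```

Under these assumptions the estimate grows from 8 to 12 bytes once the nest header is counted.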
     
  • The size reported by rtnl_vfinfo_size doesn't match the space used by
    rtnl_fill_vfinfo.

    rtnl_vfinfo_size currently doesn't account for the nest attributes
    used by statistics (added in commit 3b766cd83232), nor for struct
    ifla_vf_tx_rate (since commit ed616689a3d9, which added ifla_vf_rate
    to the dump without removing ifla_vf_tx_rate, but replaced
    ifla_vf_tx_rate with ifla_vf_rate in the size computation).

    Fixes: 3b766cd83232 ("net/core: Add reading VF statistics through the PF netdevice")
    Fixes: ed616689a3d9 ("net-next:v4: Add support to configure SR-IOV VF minimum and maximum Tx rate through ip tool")
    Signed-off-by: Sabrina Dubroca
    Signed-off-by: David S. Miller

    Sabrina Dubroca
     
  • Honor udptable parameter that is passed to __udp*_lib_mcast_deliver(),
    otherwise udplite broadcast/multicast use the wrong table and it breaks.

    Fixes: 2dc41cff7545 ("udp: Use hash2 for long hash1 chains in __udp*_lib_mcast_deliver.")
    Signed-off-by: Pablo Neira Ayuso
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Pablo Neira
     
  • In commit 24cf3af3fed5 ("igmp: call ip_mc_clear_src..."), we forgot to remove
    igmpv3_clear_delrec() in ip_mc_down(), which also called ip_mc_clear_src().
    This makes us clear all IGMPv3 source filter info after NETDEV_DOWN.
    Move igmpv3_clear_delrec() to ip_mc_destroy_dev(), after which
    ip_mc_clear_src() is no longer needed in ip_mc_destroy_dev().

    On the other hand, we should restore, rather than free, all source filter
    info in igmpv3_del_delrec(). Otherwise we will not be able to restore IGMPv3
    source filter info after NETDEV_UP and NETDEV_POST_TYPE_CHANGE.

    Fixes: 24cf3af3fed5 ("igmp: call ip_mc_clear_src() only when ...")
    Signed-off-by: Hangbin Liu
    Signed-off-by: David S. Miller

    Hangbin Liu
     

15 Nov, 2016

4 commits

  • A-MSDU aggregation alters the QoS header after a frame has been
    enqueued, so it needs to be ready before enqueue and not overwritten
    again afterwards

    Fixes: bb42f2d13ffc ("mac80211: Move reorder-sensitive TX handlers to after TXQ dequeue")
    Signed-off-by: Felix Fietkau
    Acked-by: Toke Høiland-Jørgensen
    Signed-off-by: Johannes Berg

    Felix Fietkau
     
  • The call to ieee80211_txq_enqueue overwrites the vif pointer with the
    codel enqueue time, so setting it just before that call makes no sense.

    Signed-off-by: Felix Fietkau
    Acked-by: Toke Høiland-Jørgensen
    Signed-off-by: Johannes Berg

    Felix Fietkau
     
  • The sequence number counter is used to derive the starting sequence
    number. Since that counter is updated on tx dequeue, the A-MPDU flag
    needs to be up to date at the time of dequeue as well.

    This patch prevents sending more A-MPDU frames after the session has
    been terminated and also ensures that aggregation starts right after the
    session has been established

    Fixes: bb42f2d13ffc ("mac80211: Move reorder-sensitive TX handlers to after TXQ dequeue")
    Signed-off-by: Felix Fietkau
    Acked-by: Toke Høiland-Jørgensen
    Signed-off-by: Johannes Berg

    Felix Fietkau
     
  • Some drivers (ath10k) report MCS 9 @ 20MHz, which
    technically isn't defined. To get more meaningful value
    than 0 out of this however, just extrapolate a bitrate
    from ratio of MCS 7 and 9 in channels where it is allowed.

    Signed-off-by: Thomas Pedersen
    [add a comment about it in the code]
    Signed-off-by: Johannes Berg

    Pedersen, Thomas
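
The extrapolation can be worked through with the published single-stream, long guard interval VHT rates. This is a sketch of the ratio idea only; choosing 40 MHz as the reference width is an assumption, and the constant actually used in the patch may be rounded differently.

```python
MCS7_20MHZ = 65.0    # Mbit/s, 1 spatial stream, long guard interval
MCS7_40MHZ = 135.0   # Mbit/s, same configuration at 40 MHz
MCS9_40MHZ = 180.0   # Mbit/s, MCS 9 is defined at 40 MHz


def extrapolate_mcs9_20mhz():
    """Scale the 20 MHz MCS 7 rate by the MCS 9 / MCS 7 ratio at 40 MHz."""
    return MCS7_20MHZ * MCS9_40MHZ / MCS7_40MHZ
```

The ratio 180/135 is 4/3, so the extrapolated value is roughly 86.7 Mbit/s instead of 0.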