17 Feb, 2016

1 commit

  • The present unix_stream_read_generic contains various code sequences of
    the form

    err = -EDISASTER;
    if (<test>)
            goto out;

    This has the unfortunate side effect of possibly causing the error code
    to bleed through to the final

    out:
            return copied ? : err;

    and then to be wrongly returned if no data was copied because the caller
    didn't supply a data buffer, as demonstrated by the program available at

    http://pad.lv/1540731

    Change it such that err is only set if an error condition was detected.
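
    The difference between the two shapes can be sketched in plain C. This
    is a userspace model only: read_preloaded, read_fixed, the shutdown
    flag and the choice of -EPIPE are all hypothetical stand-ins, not the
    kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Old shape: err is loaded before the test, so the value survives
 * even when the test does not fire. */
long read_preloaded(size_t size, int shutdown)
{
	long copied = 0;
	int err;

	err = -EPIPE;		/* set unconditionally */
	if (shutdown)
		goto out;

	copied = (long)size;	/* a zero-length buffer copies nothing */
out:
	/* the kernel spells this with the GNU shorthand "copied ? : err" */
	return copied ? copied : err;
}

/* New shape: err is only set once an error was actually detected. */
long read_fixed(size_t size, int shutdown)
{
	long copied = 0;
	int err = 0;

	if (shutdown) {
		err = -EPIPE;
		goto out;
	}

	copied = (long)size;
out:
	return copied ? copied : err;
}

/* Usage: a zero-length read without any error condition. */
void check_read_paths(void)
{
	assert(read_preloaded(0, 0) == -EPIPE);	/* stale error leaks out */
	assert(read_fixed(0, 0) == 0);		/* no data, no error */
	assert(read_fixed(16, 1) == -EPIPE);	/* real errors still reported */
}
```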

    Fixes: 3822b5c2fc62 ("af_unix: Revert 'lock_interruptible' in stream receive code")
    Reported-by: Joseph Salisbury
    Signed-off-by: Rainer Weikusat
    Signed-off-by: David S. Miller

    Rainer Weikusat
     

13 Feb, 2016

19 commits

  • My analysis in the below mail applies, although the second part is
    unnecessary because i isn't used in arithmetic operations here:

    https://marc.info/?l=openbsd-tech&m=145377854103866&w=2

    Thanks for your time.

    Signed-off-by: Michael McConville
    Acked-by: Francois Romieu
    Signed-off-by: David S. Miller

    Michael McConville
     
  • BRIDGE_VLAN_FILTERING automatically adds a newly bridged port to the
    VLAN with the bridge's default_pvid.

    The mv88e6xxx driver currently reserves VLANs 4000+ to isolate
    unbridged ports. When a port joins a bridge, it leaves its reserved
    VLAN. When a port leaves a bridge, it rejoins its reserved VLAN.

    But if VLAN filtering is disabled, or if this hardware VLAN is
    already in use, the bridged port ends up with no default VLAN, and
    communication with the CPU is thus broken.

    To fix this, make a port join its reserved VLAN once on setup, never
    leave it, and restore its PVID after another one may have been used.

    Signed-off-by: Vivien Didelot
    Tested-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • The current bridge code calls switchdev_port_obj_del on a VLAN port even
    if the corresponding switchdev_port_obj_add call returned -EOPNOTSUPP.

    If the DSA driver doesn't return -EOPNOTSUPP for a software port VLAN in
    its port_vlan_del function, the VLAN is not deleted. Unbridging the port
    also generates a stack trace for the same reason.

    This can be quickly tested on a VLAN filtering enabled system with:

    # brctl addbr br0
    # brctl addif br0 lan0
    # brctl addbr br1
    # brctl addif br1 lan1
    # brctl delif br1 lan1

    Both bridges have a default default_pvid set to 1. lan0 uses the
    hardware VLAN 1 while lan1 falls back to the software VLAN 1.

    Unbridging lan1 does not delete its software VLAN, and thus generates
    the following stack trace:

    [ 2991.681705] device lan1 left promiscuous mode
    [ 2991.686237] br1: port 1(lan1) entered disabled state
    [ 2991.725094] ------------[ cut here ]------------
    [ 2991.729761] WARNING: CPU: 0 PID: 869 at net/bridge/br_vlan.c:314 __vlan_group_free+0x4c/0x50()
    [ 2991.738437] Modules linked in:
    [ 2991.741546] CPU: 0 PID: 869 Comm: ip Not tainted 4.4.0 #16
    [ 2991.747039] Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree)
    [ 2991.753511] Backtrace:
    [ 2991.756008] [] (dump_backtrace) from [] (show_stack+0x20/0x24)
    [ 2991.763604] r6:80512644 r5:00000009 r4:00000000 r3:00000000
    [ 2991.769343] [] (show_stack) from [] (dump_stack+0x24/0x28)
    [ 2991.776618] [] (dump_stack) from [] (warn_slowpath_common+0x98/0xc4)
    [ 2991.784750] [] (warn_slowpath_common) from [] (warn_slowpath_null+0x2c/0x34)
    [ 2991.793557] r8:00000000 r7:9f786a8c r6:9f76c440 r5:9f786a00 r4:9f68ac00
    [ 2991.800366] [] (warn_slowpath_null) from [] (__vlan_group_free+0x4c/0x50)
    [ 2991.808946] [] (__vlan_group_free) from [] (nbp_vlan_flush+0x44/0x68)
    [ 2991.817147] r4:9f68ac00 r3:9ec70000
    [ 2991.820772] [] (nbp_vlan_flush) from [] (del_nbp+0xac/0x130)
    [ 2991.828201] r5:9f56f800 r4:9f786a00
    [ 2991.831841] [] (del_nbp) from [] (br_del_if+0x40/0xbc)
    [ 2991.838724] r7:80590f68 r6:00000000 r5:9ec71c38 r4:9f76c440
    [ 2991.844475] [] (br_del_if) from [] (br_del_slave+0x1c/0x20)
    [ 2991.851802] r5:9ec71c38 r4:9f56f800
    [ 2991.855428] [] (br_del_slave) from [] (do_setlink+0x324/0x7b8)
    [ 2991.863043] [] (do_setlink) from [] (rtnl_newlink+0x508/0x6f4)
    [ 2991.870616] r10:00000000 r9:9ec71ba8 r8:00000000 r7:00000000 r6:9f6b0400 r5:9f56f800
    [ 2991.878548] r4:8076278c
    [ 2991.881110] [] (rtnl_newlink) from [] (rtnetlink_rcv_msg+0x18c/0x22c)
    [ 2991.889315] r10:9f7d4e40 r9:00000000 r8:00000000 r7:00000000 r6:9f7d4e40 r5:9f6b0400
    [ 2991.897250] r4:00000000
    [ 2991.899814] [] (rtnetlink_rcv_msg) from [] (netlink_rcv_skb+0xb0/0xcc)
    [ 2991.908104] r8:00000000 r7:9f7d4e40 r6:9f7d4e40 r5:80483ebc r4:9f6b0400
    [ 2991.914928] [] (netlink_rcv_skb) from [] (rtnetlink_rcv+0x34/0x3c)
    [ 2991.922874] r6:9f5ea000 r5:00000028 r4:9f7d4e40 r3:80483e80
    [ 2991.928622] [] (rtnetlink_rcv) from [] (netlink_unicast+0x180/0x200)
    [ 2991.936742] r4:9f4edc00 r3:80483e80
    [ 2991.940362] [] (netlink_unicast) from [] (netlink_sendmsg+0x33c/0x350)
    [ 2991.948648] r8:00000000 r7:00000028 r6:00000000 r5:9f5ea000 r4:9ec71f4c
    [ 2991.955481] [] (netlink_sendmsg) from [] (sock_sendmsg+0x24/0x34)
    [ 2991.963342] r10:00000000 r9:9ec71e28 r8:00000000 r7:9f1e2140 r6:00000000 r5:00000000
    [ 2991.971276] r4:9ec71f4c
    [ 2991.973849] [] (sock_sendmsg) from [] (___sys_sendmsg+0x1fc/0x204)
    [ 2991.981809] [] (___sys_sendmsg) from [] (__sys_sendmsg+0x4c/0x7c)
    [ 2991.989640] r10:00000000 r9:9ec70000 r8:80010824 r7:00000128 r6:7ee946c4 r5:00000000
    [ 2991.997572] r4:9f1e2140
    [ 2992.000128] [] (__sys_sendmsg) from [] (SyS_sendmsg+0x18/0x1c)
    [ 2992.007725] r6:00000000 r5:7ee9c7b8 r4:7ee946e0
    [ 2992.012430] [] (SyS_sendmsg) from [] (ret_fast_syscall+0x0/0x3c)
    [ 2992.020182] ---[ end trace 5d4bc29f4da04280 ]---

    To fix this, return -EOPNOTSUPP in _mv88e6xxx_port_vlan_del instead of
    -ENOENT if the hardware VLAN doesn't exist or the port is not a member.

    Signed-off-by: Vivien Didelot
    Tested-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • smatch detected a suspicious looking bitop condition:

    drivers/net/ethernet/cavium/liquidio/lio_main.c:2529
    handle_timestamp() warn: suspicious bitop condition

    (skb_shinfo(skb)->tx_flags | SKBTX_IN_PROGRESS) is always non-zero,
    so the logic is definitely not correct. Use & to mask the correct
    bit.
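
    A minimal sketch of why the | form can never be false. DEMO_IN_PROGRESS
    is an illustrative value and the function names are hypothetical; the
    real flag is defined in the kernel headers.

```c
#include <assert.h>

/* Illustrative single-bit flag, standing in for SKBTX_IN_PROGRESS. */
#define DEMO_IN_PROGRESS (1u << 2)

/* Buggy: OR with a non-zero constant is non-zero for every input. */
int in_progress_buggy(unsigned int tx_flags)
{
	return (tx_flags | DEMO_IN_PROGRESS) != 0;
}

/* Fixed: AND masks out the one bit of interest. */
int in_progress_fixed(unsigned int tx_flags)
{
	return (tx_flags & DEMO_IN_PROGRESS) != 0;
}

/* Usage: only the masked test distinguishes set from clear. */
void check_bitop(void)
{
	assert(in_progress_buggy(0) == 1);	/* always true */
	assert(in_progress_fixed(0) == 0);
	assert(in_progress_fixed(DEMO_IN_PROGRESS) == 1);
}
```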

    Signed-off-by: Colin Ian King
    Signed-off-by: David S. Miller

    Colin Ian King
     
  • Commit c0eb454034aa ("hv_netvsc: Don't ask for additional head room in the
    skb") got rid of the needed_headroom setting for the driver. With that
    change I hit the following issue when trying to use the pktgen module:

    [ 57.522021] kernel BUG at net/core/skbuff.c:1128!
    [ 57.522021] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
    ...
    [ 58.721068] Call Trace:
    [ 58.721068] [] netvsc_start_xmit+0x4c6/0x8e0 [hv_netvsc]
    ...
    [ 58.721068] [] ? pktgen_finalize_skb+0x25c/0x2a0 [pktgen]
    [ 58.721068] [] ? __netdev_alloc_skb+0xc0/0x100
    [ 58.721068] [] pktgen_thread_worker+0x257/0x1920 [pktgen]

    Basically, we're calling skb_cow_head(skb, RNDIS_AND_PPI_SIZE) and crash on
    if (skb_shared(skb))
    BUG();

    We probably need to restore the needed_headroom setting (but shrunk to
    RNDIS_AND_PPI_SIZE as we don't need more) to request the required headroom
    space. In theory, it should not cause a performance penalty.

    Signed-off-by: Vitaly Kuznetsov
    Signed-off-by: David S. Miller

    Vitaly Kuznetsov
     
  • Gregory CLEMENT says:

    ====================
    mvneta fixes for SMP

    Following this bug report:
    http://thread.gmane.org/gmane.linux.ports.arm.kernel/468173 and the
    suggestions from Russell King, I reviewed all the code involving
    multi-CPU. It ended with this series of patches which should improve
    the stability of the driver.

    During my tests I found another bug, which is fixed by a new patch (the
    second one in this new version of the series).

    The first two patches fix real bugs; the others fix potential issues
    in the driver.

    Changelog:

    v1 -> v2
    - Fix spinlock comment. Pointed out by David Miller

    v2 -> v3
    - Fix typos and mistakes in the comments. Pointed out by Sergei Shtylyov
    - Add a new patch fixing the CPU choice in mvneta_percpu_elect
    - Use lock in last patch to prevent remaining race condition. Pointed
      out by Jisheng
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • When stopping the port, the CPU notifiers are still there while the
    mvneta_stop_dev function calls mvneta_percpu_disable() on each CPU.
    A new CPU could come online at this point, which would be racy.

    This patch adds a flag that prevents the notifier code from running for
    a new CPU while the port is stopping. It also uses the spinlock
    introduced previously. To avoid a deadlock, the lock has been moved
    outside the mvneta_percpu_elect function.

    Signed-off-by: Gregory CLEMENT
    Signed-off-by: David S. Miller

    Gregory CLEMENT
     
  • Electing a CPU must be done atomically: it should happen either before
    or after the removal/insertion of a CPU, and this function is not
    reentrant.

    During the loop in mvneta_percpu_elect we associate the queues with the
    CPUs. If there is a topology change during this loop, the mapping
    between the CPUs and the queues could be wrong. During this loop the
    interrupt mask is also updated for each CPU; it should not be changed
    at the same time by other parts of the driver.

    This patch adds a spinlock to create the needed critical sections.

    Signed-off-by: Gregory CLEMENT
    Signed-off-by: David S. Miller

    Gregory CLEMENT
     
  • In the MVNETA_INTR_* registers, the queue-related fields are per CPU,
    according to the datasheet (comments in [] were added by me):
    "In a multi-CPU system, bits of RX[or TX] queues for which the access by
    the reading[or writing] CPU is disabled are read as 0, and cannot be
    cleared[or written]."

    That means that each time we want to manipulate these bits we have to
    do it on each CPU and not only on the current CPU.

    Signed-off-by: Gregory CLEMENT
    Signed-off-by: David S. Miller

    Gregory CLEMENT
     
  • Since commit 2dcf75e2793c ("net: mvneta: Associate RX queues with
    each CPU"), all the percpu IRQs are used and disabled at initialization,
    so there is no point in disabling them first.

    Signed-off-by: Gregory CLEMENT
    Signed-off-by: David S. Miller

    Gregory CLEMENT
     
  • Instead of using a for_each_* loop in which we just call the
    smp_call_function_single macro, it is simpler to use the on_each_cpu
    macro directly. Moreover, this macro ensures that the calls will be
    done all at once.

    Suggested-by: Russell King
    Signed-off-by: Gregory CLEMENT
    Signed-off-by: David S. Miller

    Gregory CLEMENT
     
  • When support for multiple RX queues was added, the
    mvneta_percpu_elect function was broken. The use of the modulo can lead
    to electing the wrong CPU. For example, with rxq_def=2, if CPU 2 goes
    offline and then online, we end up with the third RX queue activated
    at the same time on CPU 0 and CPU 2, which leads to a kernel crash.

    With this fix, we no longer try to get "the closest" CPU if the default
    CPU is gone; we just use CPU 0, which will always be there. Thanks to
    this, the code becomes more readable, easier to maintain and more
    predictable.
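
    The election rule described above can be sketched as follows. This is
    a model only: the online CPUs are represented as a bitmask, and
    cpu_is_online and elect_cpu are hypothetical names, not the driver's
    functions.

```c
#include <assert.h>

/* Hypothetical model: bit n of online_mask set means CPU n is online. */
int cpu_is_online(unsigned int online_mask, int cpu)
{
	return cpu >= 0 && cpu < 32 && ((online_mask >> cpu) & 1u);
}

/* Use the CPU matching rxq_def when it is online, otherwise CPU 0. */
int elect_cpu(unsigned int online_mask, int rxq_def)
{
	/* assumption taken from the text: CPU 0 is always there */
	assert(cpu_is_online(online_mask, 0));

	return cpu_is_online(online_mask, rxq_def) ? rxq_def : 0;
}
```

    With rxq_def=2, elect_cpu(0xF, 2) picks CPU 2, while elect_cpu(0xB, 2)
    (CPU 2 offline) falls back to CPU 0, so exactly one CPU is elected in
    either case.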

    Cc: stable@vger.kernel.org
    Fixes: 2dcf75e2793c ("net: mvneta: Associate RX queues with each CPU")
    Signed-off-by: Gregory CLEMENT
    Signed-off-by: David S. Miller

    Gregory CLEMENT
     
  • This patch converts the for_each_present loop into on_each_cpu, so that
    instead of being applied to the present CPUs, it is applied only to the
    online CPUs. This fixes a bug reported at
    http://thread.gmane.org/gmane.linux.ports.arm.kernel/468173.

    Using the on_each_cpu macro (instead of a for_each_* loop) also ensures
    that all the calls will be done at once.

    Fixes: f86428854480 ("net: mvneta: Statically assign queues to CPUs")
    Reported-by: Stefan Roese
    Suggested-by: Jisheng Zhang
    Suggested-by: Russell King
    Signed-off-by: Gregory CLEMENT
    Signed-off-by: David S. Miller

    Gregory CLEMENT
     
  • We received a bug report from someone using vmware:

    WARNING: CPU: 3 PID: 660 at kernel/sched/core.c:7389
    __might_sleep+0x7d/0x90()
    do not call blocking ops when !TASK_RUNNING; state=1 set at
    [] prepare_to_wait+0x2d/0x90
    Modules linked in: vmw_vsock_vmci_transport vsock snd_seq_midi
    snd_seq_midi_event snd_ens1371 iosf_mbi gameport snd_rawmidi
    snd_ac97_codec ac97_bus snd_seq coretemp snd_seq_device snd_pcm
    snd_timer snd soundcore ppdev crct10dif_pclmul crc32_pclmul
    ghash_clmulni_intel vmw_vmci vmw_balloon i2c_piix4 shpchp parport_pc
    parport acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc btrfs
    xor raid6_pq 8021q garp stp llc mrp crc32c_intel serio_raw mptspi vmwgfx
    drm_kms_helper ttm drm scsi_transport_spi mptscsih e1000 ata_generic
    mptbase pata_acpi
    CPU: 3 PID: 660 Comm: vmtoolsd Not tainted
    4.2.0-0.rc1.git3.1.fc23.x86_64 #1
    Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop
    Reference Platform, BIOS 6.00 05/20/2014
    0000000000000000 0000000049e617f3 ffff88006ac37ac8 ffffffff818641f5
    0000000000000000 ffff88006ac37b20 ffff88006ac37b08 ffffffff810ab446
    ffff880068009f40 ffffffff81c63bc0 0000000000000061 0000000000000000
    Call Trace:
    [] dump_stack+0x4c/0x65
    [] warn_slowpath_common+0x86/0xc0
    [] warn_slowpath_fmt+0x55/0x70
    [] ? debug_lockdep_rcu_enabled+0x1d/0x20
    [] ? prepare_to_wait+0x2d/0x90
    [] ? prepare_to_wait+0x2d/0x90
    [] __might_sleep+0x7d/0x90
    [] __might_fault+0x43/0xa0
    [] copy_from_iter+0x87/0x2a0
    [] __qp_memcpy_to_queue+0x9a/0x1b0 [vmw_vmci]
    [] ? qp_memcpy_to_queue+0x20/0x20 [vmw_vmci]
    [] qp_memcpy_to_queue_iov+0x17/0x20 [vmw_vmci]
    [] qp_enqueue_locked+0xa0/0x140 [vmw_vmci]
    [] vmci_qpair_enquev+0x4f/0xd0 [vmw_vmci]
    [] vmci_transport_stream_enqueue+0x1b/0x20
    [vmw_vsock_vmci_transport]
    [] vsock_stream_sendmsg+0x2c5/0x320 [vsock]
    [] ? wake_atomic_t_function+0x70/0x70
    [] sock_sendmsg+0x38/0x50
    [] SYSC_sendto+0x104/0x190
    [] ? vfs_read+0x8a/0x140
    [] SyS_sendto+0xe/0x10
    [] entry_SYSCALL_64_fastpath+0x12/0x76

    transport->stream_enqueue may call copy_to_user, so it should
    not be called inside a prepare_to_wait. Narrow the scope of
    the prepare_to_wait to avoid the bad call. This also applies
    to vsock_stream_recvmsg.

    Reported-by: Vinson Lee
    Tested-by: Vinson Lee
    Signed-off-by: Laura Abbott
    Signed-off-by: David S. Miller

    Laura Abbott
     
  • There are typos in the RTL8168H hardware parameter settings. If the
    system installs another version of the driver, this may cause a system hang.

    Signed-off-by: Chunhao Lin
    Signed-off-by: David S. Miller

    Chun-Hao Lin
     
  • Dmitry reported memory leaks of IP options allocated in
    ip_cmsg_send() when/if this function returns an error.

    Callers are responsible for the freeing.

    Many thanks to Dmitry for the report and diagnostic.

    Reported-by: Dmitry Vyukov
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • The return value of kzalloc on failure of allocation of memory should
    be -ENOMEM and not -1.

    Found using Coccinelle. A simplified version of the semantic patch
    used is:

    // <smpl>
    @@
    expression *e;
    position p,q;
    @@

    e@q = kzalloc(...);
    if@p (e == NULL) {
    ...
    return
    - -1
    + -ENOMEM
    ;
    }
    // </smpl>

    This function may also return -1 after calling mpp2_prs_tcam_port_map_get.
    So that the function consistently returns meaningful error values on
    failure, the -1 is changed to -EINVAL.
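
    The convention can be sketched in userspace C, with calloc() standing
    in for kzalloc(). table_init and its arguments are hypothetical; only
    the errno-style return values reflect the change described above.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Allocate a zeroed table of n ints; returns 0 or a negative errno. */
int table_init(int **out, size_t n)
{
	int *t;

	assert(out != NULL);

	if (n == 0)
		return -EINVAL;	/* bad argument, not an allocation failure */

	t = calloc(n, sizeof(*t));
	if (t == NULL)
		return -ENOMEM;	/* rather than a bare -1 */

	*out = t;
	return 0;
}
```

    A caller can then tell the failure modes apart instead of seeing -1
    for all of them.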

    Signed-off-by: Amitoj Kaur Chawla
    Signed-off-by: David S. Miller

    Amitoj Kaur Chawla
     
  • The return value of vmalloc on failure of allocation of memory should
    be -ENOMEM and not -1.

    Found using Coccinelle. A simplified version of the semantic patch
    used is:

    // <smpl>
    @@
    expression *e;
    identifier l1;
    position p,q;
    @@

    e@q = vmalloc(...);
    if@p (e == NULL) {
    ...
    goto l1;
    }
    l1:
    ...
    return
    - -1
    + -ENOMEM
    ;
    // </smpl>

    Signed-off-by: David S. Miller

    Amitoj Kaur Chawla
     
  • The current logic in bond_arp_rcv will accept an incoming ARP for
    validation if (a) the receiving slave is either "active" (which includes
    the currently active slave, or the current ARP slave) or, (b) there is a
    currently active slave, and it has received an ARP since it became active.
    For case (b), the receiving slave isn't the currently active slave, and is
    receiving the original broadcast ARP request, not an ARP reply from the
    target.

    This logic can fail if there is no currently active slave. In
    this situation, the ARP probe logic cycles through all slaves, assigning
    each in turn as the "current_arp_slave" for one arp_interval, then setting
    that one as "active," and sending an ARP probe from that slave. The
    current logic expects the ARP reply to arrive on the sending
    current_arp_slave, however, due to switch FDB updating delays, the reply
    may be directed to another slave.

    This can arise if the bonding slaves and switch are working, but
    the ARP target is not responding. When the ARP target recovers, a
    condition may result wherein the ARP target host replies faster than the
    switch can update its forwarding table, causing each ARP reply to be sent
    to the previous current_arp_slave. This will never pass the logic in
    bond_arp_rcv, as neither of the above conditions (a) or (b) is met.

    Some experimentation on a LAN shows ARP reply round trips in the
    200 usec range, but my available switches never update their FDB in less
    than 4000 usec.

    This patch changes the logic in bond_arp_rcv to additionally
    accept an ARP reply for validation on any slave if there is a current ARP
    slave and it sent an ARP probe during the previous arp_interval.

    Fixes: aeea64ac717a ("bonding: don't trust arp requests unless active slave really works")
    Cc: Veaceslav Falico
    Cc: Andy Gospodarek
    Signed-off-by: Jay Vosburgh
    Signed-off-by: David S. Miller

    Jay Vosburgh
     

12 Feb, 2016

3 commits

  • Pull GPIO fixes from Linus Walleij:
    - Probe errorpath fix for the Altera
    - irqchip ofnode pointer added to the DaVinci driver
    - controller instance number correction for DaVinci

    * tag 'gpio-v4.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
    gpio: davinci: Fix the number of controllers allocated
    gpio: davinci: Add the missing of-node pointer
    gpio: gpio-altera: Remove gpiochip on probe failure.

    Linus Torvalds
     
  • …linux-platform-drivers-x86

    Pull x86 platform driver fixes from Darren Hart:
    "Just two small fixes for the 4.5-rc cycle:

    intel_scu_ipcutil:
    - underflow in scu_reg_access()

    intel-hid:
    - fix incorrect entries in intel_hid_keymap"

    * tag 'platform-drivers-x86-v4.5-3' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
    intel_scu_ipcutil: underflow in scu_reg_access()
    intel-hid: fix incorrect entries in intel_hid_keymap

    Linus Torvalds
     
  • Pull networking fixes from David Miller:

    1) Fix BPF handling of branch offset adjustments on backjumps, from
    Daniel Borkmann.

    2) Make sure selinux knows about SOCK_DESTROY netlink messages, from
    Lorenzo Colitti.

    3) Fix openvswitch tunnel mtu regression, from David Wragg.

    4) Fix ICMP handling of TCP sockets in syn_recv state, from Eric
    Dumazet.

    5) Fix SCTP user hmacid byte ordering bug, from Xin Long.

    6) Fix recursive locking in ipv6 addrconf, from Subash Abhinov
    Kasiviswanathan.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
    bpf: fix branch offset adjustment on backjumps after patching ctx expansion
    vxlan, gre, geneve: Set a large MTU on ovs-created tunnel devices
    geneve: Relax MTU constraints
    vxlan: Relax MTU constraints
    flow_dissector: Fix unaligned access in __skb_flow_dissector when used by eth_get_headlen
    of: of_mdio: Add marvell, 88e1145 to whitelist of PHY compatibilities.
    selinux: nlmsgtab: add SOCK_DESTROY to the netlink mapping tables
    sctp: translate network order to host order when users get a hmacid
    enic: increment devcmd2 result ring in case of timeout
    tg3: Fix for tg3 transmit queue 0 timed out when too many gso_segs
    net:Add sysctl_max_skb_frags
    tcp: do not drop syn_recv on all icmp reports
    ipv6: fix a lockdep splat
    unix: correctly track in-flight fds in sending process user_struct
    update be2net maintainers' email addresses
    dwc_eth_qos: Reset hardware before PHY start
    ipv6: addrconf: Fix recursive spin lock call

    Linus Torvalds
     

11 Feb, 2016

8 commits

  • Pull rdma fixes from Doug Ledford:
    "A few more minor fixes for rc3:

    - One fix to ipoib
    - One fix to core sysfs code
    - Four patches that resolve an oops found in testing of ocrdma and a
    couple other ocrdma issues"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
    RDMA/ocrdma: Fixing ocrdma debugfs directory remove
    RDMA/ocrdma: Fix pkey_index returned by driver in rq work completion
    RDMA/ocrdma: populate max_sge_rd in device attributes
    RDMA/ocrdma: Initialize stats resources in the driver before ib device registration.
    IB/sysfs: remove unused va_list args
    IB/IPoIB: Do not set skb truesize since using one linearskb

    Linus Torvalds
     
  • When ctx access is used, the kernel often needs to expand/rewrite
    instructions. After that patching, branch offsets have to be
    adjusted for both forward and backward jumps in the new eBPF program,
    but for backward jumps the code fails to account for the delta. For
    example, if the expansion happens exactly on the insn that sits at
    the jump target, the back jump offset is not fixed up.

    Analysis on what the check in adjust_branches() is currently doing:

    /* adjust offset of jmps if necessary */
    if (i < pos && i + insn->off + 1 > pos)
            insn->off += delta;
    else if (i > pos && i + insn->off + 1 < pos)
            insn->off -= delta;

    First condition (forward jumps):

    [Before/After diagram of insns[] not recoverable]

    [...] off + 1 == pos, means we jump to that newly patched
    instruction, so no offset adjustment is needed. That part is correct.

    Second condition (backward jumps):

    [Before/After diagram of insns[] not recoverable]

    [...] pos is okay only by itself. However, i +
    insn->off + 1 < pos does not always work as intended to trigger the
    adjustment. It works when jump targets would be far off where the
    delta wouldn't matter. But, for example, where the fixed insn->off
    before pointed to pos (target_Y), it now points to pos + delta, so
    that additional room needs to be taken into account for the check.
    This means that i) both tests here need to be adjusted into pos + delta,
    and ii) for the second condition, the test needs to be [...]

    Signed-off-by: David S. Miller
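
    One way to read points i) and ii) is sketched below on a toy
    instruction array. The struct and function names are hypothetical, and
    the use of <= in the second test is an assumption, since the sentence
    describing it is cut off above; this is not the kernel's
    adjust_branches().

```c
#include <assert.h>

struct demo_insn {
	int is_jmp;	/* non-zero for a branch instruction */
	int off;	/* relative offset; the target index is i + off + 1 */
};

/*
 * prog[] is the program after the instruction at index pos was
 * replaced by delta + 1 instructions (occupying pos .. pos + delta).
 */
void adjust_branches_sketch(struct demo_insn *prog, int len,
			    int pos, int delta)
{
	int i;

	assert(pos >= 0 && delta >= 0 && pos + delta < len);

	for (i = 0; i < len; i++) {
		struct demo_insn *insn = &prog[i];

		if (!insn->is_jmp)
			continue;

		/* forward jump across the patched area */
		if (i < pos && i + insn->off + 1 > pos)
			insn->off += delta;
		/* backward jump from behind the patched area (assumed <=) */
		else if (i > pos + delta &&
			 i + insn->off + 1 <= pos + delta)
			insn->off -= delta;
	}
}
```

    For example, take a four-instruction program whose last instruction
    jumps back to index 1 (off = -3). After index 1 is expanded with
    delta = 2, the jump sits at index 5 and must end up with off = -5 to
    still reach index 1. The original test i + insn->off + 1 < pos
    evaluates to 3 < 1 and leaves the offset untouched.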

    Daniel Borkmann
     
  • Pull input updates from Dmitry Torokhov:
    "Just small driver fixups"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
    Input: colibri-vf50-ts - add missing #include
    Input: adp5589 - fix row 5 handling for adp5589
    Input: edt-ft5x06 - fix setting gain, offset, and threshold via device tree
    Input: vmmouse - fix absolute device registration
    Input: serio - drop warnings in case of EPROBE_DEFER from serio_find_driver()
    Input: cap11xx - add missing of_node_put
    Input: sirfsoc-onkey - allow modular build
    Input: xpad - remove unused function

    Linus Torvalds
     
  • Pull libata fixes from Tejun Heo:

    - PORTS_IMPL workaround for very early ahci controllers is misbehaving
    on new systems. Disabled on recent ahci versions.

    - Old-style PIO state machine had a horrible locking problem. Don't
    know how we've been getting away this far. Fixed.

    - Other device specific updates.

    * 'for-4.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
    ahci: Intel DNV device IDs SATA
    libata: fix sff host state machine locking while polling
    libata-sff: use WARN instead of BUG on illegal host state machine state
    libata: disable forced PORTS_IMPL for >= AHCI 1.3
    libata: blacklist a Viking flash model for MWDMA corruption
    drivers: ata: wake port before DMA stop for ALPM

    Linus Torvalds
     
  • Pull cgroup fixes from Tejun Heo:

    - The destruction paths of cgroup objects are asynchronous and
    multi-staged and some of them ended up destroying parents before
    children, leading to failures in cpu and memory controllers. Ensure
    that parents are always destroyed after children.

    - cpuset mm node migration was performed synchronously while holding
    threadgroup and cgroup mutexes and the recent threadgroup locking
    update resulted in a possible deadlock. The migration is best effort
    and shouldn't have been performed under those locks to begin with.
    Made asynchronous.

    - Minor documentation fix.

    * 'for-4.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
    Documentation: cgroup: Fix 'cgroup-legacy' -> 'cgroup-v1'
    cgroup: make sure a parent css isn't freed before its children
    cgroup: make sure a parent css isn't offlined before its children
    cpuset: make mm migration asynchronous

    Linus Torvalds
     
  • Pull workqueue fixes from Tejun Heo:
    "Workqueue fixes for v4.5-rc3.

    - Remove a spurious triggering of flush dependency warning.

    - Officially break local execution guarantee of unbound work items
    and add a debug feature to flush out usages which depend on it.

    - Work around CPU -> NODE mapping becoming invalid on CPU offline.

    The branch is young but pushing out early as stable kernels are being
    affected"

    * 'for-4.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
    workqueue: handle NUMA_NO_NODE for unbound pool_workqueue lookup
    workqueue: implement "workqueue.debug_force_rr_cpu" debug feature
    workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs
    Revert "workqueue: make sure delayed work run in local cpu"
    workqueue: skip flush dependency checks for legacy workqueues

    Linus Torvalds
     
  • When looking up the pool_workqueue to use for an unbound workqueue,
    workqueue assumes that the target CPU is always bound to a valid NUMA
    node. However, currently, when a CPU goes offline, the mapping is
    destroyed and cpu_to_node() returns NUMA_NO_NODE.

    This has always been broken but wasn't triggered often enough before
    874bbfe600a6 ("workqueue: make sure delayed work run in local cpu").
    After that commit, workqueue forcibly assigns the local CPU for
    delayed work items without an explicit target CPU to fix a different
    issue. This widens the window in which a CPU can go offline while a
    delayed work item is pending, causing delayed work items to be
    dispatched with the target CPU set to an already offlined CPU. The
    resulting NUMA_NO_NODE mapping makes workqueue try to queue the work
    item on a NULL pool_workqueue and thus crash.

    While 874bbfe600a6 has been reverted for a different reason making the
    bug less visible again, it can still happen. Fix it by mapping
    NUMA_NO_NODE to the default pool_workqueue from unbound_pwq_by_node().
    This is a temporary workaround. The long term solution is keeping CPU
    -> NODE mapping stable across CPU off/online cycles which is being
    worked on.
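
    The workaround amounts to a guarded table lookup, sketched here as a
    userspace model. The demo_* types, DEMO_MAX_NODES and pwq_by_node are
    hypothetical; only the idea of falling back to the default
    pool_workqueue for NUMA_NO_NODE comes from the text above.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NUMA_NO_NODE (-1)
#define DEMO_MAX_NODES 4

struct demo_pwq {
	int id;
};

struct demo_wq {
	struct demo_pwq *numa_pwq_tbl[DEMO_MAX_NODES];	/* per-node entries */
	struct demo_pwq *dfl_pwq;			/* default fallback */
};

struct demo_pwq *pwq_by_node(struct demo_wq *wq, int node)
{
	assert(wq != NULL && wq->dfl_pwq != NULL);

	/* An offlined CPU may map to no node: fall back to the default
	 * entry instead of indexing the table with an invalid node. */
	if (node == DEMO_NUMA_NO_NODE)
		return wq->dfl_pwq;

	assert(node >= 0 && node < DEMO_MAX_NODES);
	return wq->numa_pwq_tbl[node];
}
```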

    Signed-off-by: Tejun Heo
    Reported-by: Mike Galbraith
    Cc: Tang Chen
    Cc: Rafael J. Wysocki
    Cc: Len Brown
    Cc: stable@vger.kernel.org
    Link: http://lkml.kernel.org/g/1454424264.11183.46.camel@gmail.com
    Link: http://lkml.kernel.org/g/1453702100-2597-1-git-send-email-tangchen@cn.fujitsu.com

    Tejun Heo
     
  • Adding Intel codename DNV platform device IDs for SATA.

    Signed-off-by: Alexandra Yates
    Signed-off-by: Tejun Heo
    Cc: stable@vger.kernel.org

    Alexandra Yates
     

10 Feb, 2016

9 commits

  • David Wragg says:

    ====================
    Set a large MTU on ovs-created tunnel devices

    Prior to 4.3, openvswitch tunnel vports (vxlan, gre and geneve) could
    transmit vxlan packets of any size, constrained only by the ability to
    send out the resulting packets. 4.3 introduced netdevs corresponding
    to tunnel vports. These netdevs have an MTU, which limits the size of
    a packet that can be successfully encapsulated. The default MTU
    values are low (1500 or less), which is awkwardly small in the context
    of physical networks supporting jumbo frames, and leads to a
    conspicuous change in behaviour for userspace.

    This patch series sets the MTU on openvswitch-created netdevs to be
    the relevant maximum (i.e. the maximum IP packet size minus any
    relevant overhead), effectively restoring the behaviour prior to 4.3.

    Where relevant, the limits on MTU values that can be directly set on
    the netdevs are also relaxed.

    Changes in v2:
    * Extend to all openvswitch tunnel types, i.e. gre and geneve as well
    * Use IP_MAX_MTU

    Changes in v3:
    * Fix block comment style
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Prior to 4.3, openvswitch tunnel vports (vxlan, gre and geneve) could
    transmit vxlan packets of any size, constrained only by the ability to
    send out the resulting packets. 4.3 introduced netdevs corresponding
    to tunnel vports. These netdevs have an MTU, which limits the size of
    a packet that can be successfully encapsulated. The default MTU
    values are low (1500 or less), which is awkwardly small in the context
    of physical networks supporting jumbo frames, and leads to a
    conspicuous change in behaviour for userspace.

    Instead, set the MTU on openvswitch-created netdevs to be the relevant
    maximum (i.e. the maximum IP packet size minus any relevant overhead),
    effectively restoring the behaviour prior to 4.3.

    Signed-off-by: David Wragg
    Signed-off-by: David S. Miller

    David Wragg
     
  • Allow the MTU of geneve devices to be set to large values, in order to
    exploit underlying networks with larger frame sizes.

    GENEVE does not have a fixed encapsulation overhead (an openvswitch
    rule can add variable length options), so there is no relevant maximum
    MTU to enforce. A maximum of IP_MAX_MTU is used instead.
    Encapsulated packets that are too big for the underlying network will
    get dropped on the floor.

    Signed-off-by: David Wragg
    Signed-off-by: David S. Miller

    David Wragg
     
  • Allow the MTU of vxlan devices without an underlying device to be set
    to larger values (up to a maximum based on IP packet limits and vxlan
    overhead).

    Previously, their MTUs could not be set to higher than the
    conventional ethernet value of 1500. This is a very arbitrary value
    in the context of vxlan, and prevented vxlan devices from being able
    to take advantage of jumbo frames etc.

    The default MTU remains 1500, for compatibility.

    Signed-off-by: David Wragg
    Acked-by: Roopa Prabhu
    Signed-off-by: David S. Miller

    David Wragg
     
  • The driver only needs to allocate [ngpio / 32] controllers,
    as each controller handles 32 GPIOs. But the current driver
    allocates ngpio controllers, of which the extra ones are unused.
    Fix it by registering only the required number of controllers.
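
    The controller count can be sketched as a rounded-up division. Reading
    [ngpio / 32] as "round up" is an assumption here, and
    controllers_needed is a hypothetical helper, not the driver's code.

```c
#include <assert.h>
#include <limits.h>

#define GPIOS_PER_CONTROLLER 32u

/* Number of 32-GPIO controllers needed to cover ngpio lines. */
unsigned int controllers_needed(unsigned int ngpio)
{
	/* guard the addition below against wrap-around */
	assert(ngpio <= UINT_MAX - (GPIOS_PER_CONTROLLER - 1));

	/* round up so that a partially used bank still gets a controller */
	return (ngpio + GPIOS_PER_CONTROLLER - 1) / GPIOS_PER_CONTROLLER;
}
```

    For instance, 144 GPIOs need 5 controllers rather than 144.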

    Signed-off-by: Lokesh Vutla
    Signed-off-by: Keerthy
    Reviewed-by: Grygorii Strashko
    Signed-off-by: Linus Walleij

    Lokesh Vutla
     
  • Currently the first parameter of irq_domain_add_legacy is NULL.
    irq_find_host function returns NULL when we do not populate the of_node
    and hence irq_of_parse_and_map call fails whenever we want to request a
    gpio irq. This fixes the request_irq failures for gpio interrupts.

    Signed-off-by: Keerthy
    Reviewed-by: Grygorii Strashko
    Signed-off-by: Linus Walleij

    Keerthy
     
  • Pull module fixes from Rusty Russell:
    "Fix for async_probe module param added in 4.3 (clearly not widely used
    yet), and a much more interesting kallsyms race which has been around
    approximately forever. This fix is more invasive, and will require
    some care in backporting, but I hated all the bandaids I could think
    of, so...

    There are some more coming, which are only for breakages introduced
    this cycle (livepatch), but wanted these in now"

    * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
    modules: fix longstanding /proc/kallsyms vs module insertion race.
    module: wrapper for symbol name.
    modules: fix modparam async_probe request

    Linus Torvalds
     
  • drivers/input/touchscreen/colibri-vf50-ts.c: In function ‘vf50_ts_probe’:
    drivers/input/touchscreen/colibri-vf50-ts.c:302: error: implicit declaration of function ‘of_property_read_u32’

    Signed-off-by: Geert Uytterhoeven
    Signed-off-by: Dmitry Torokhov

    Geert Uytterhoeven
     
  • The adp5589 has row 5, don't skip it when creating the GPIO mapping.
    Otherwise the pin gets reserved as used and it is not possible to use it as
    a GPIO.

    Signed-off-by: Lars-Peter Clausen
    Acked-by: Michael Hennerich
    Signed-off-by: Dmitry Torokhov

    Lars-Peter Clausen