03 Dec, 2016

8 commits

  • segs needs to be checked for being NULL in ipv6_gso_segment() before calling
    skb_shinfo(segs), otherwise the kernel can run into a NULL-pointer dereference:

    [ 97.811262] BUG: unable to handle kernel NULL pointer dereference at 00000000000000cc
    [ 97.819112] IP: [] ipv6_gso_segment+0x119/0x2f0
    [ 97.825214] PGD 0 [ 97.827047]
    [ 97.828540] Oops: 0000 [#1] SMP
    [ 97.831678] Modules linked in: vhost_net vhost macvtap macvlan nfsv3 rpcsec_gss_krb5
    nfsv4 dns_resolver nfs fscache xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4
    iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack
    ipt_REJECT nf_reject_ipv4 tun ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter
    bridge stp llc snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_intel
    snd_hda_codec edac_mce_amd snd_hda_core edac_core snd_hwdep kvm_amd snd_seq kvm snd_seq_device
    snd_pcm irqbypass snd_timer ppdev parport_serial snd parport_pc k10temp pcspkr soundcore parport
    sp5100_tco shpchp sg wmi i2c_piix4 acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc
    ip_tables xfs libcrc32c sr_mod cdrom sd_mod ata_generic pata_acpi amdkfd amd_iommu_v2 radeon
    broadcom bcm_phy_lib i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops
    ttm ahci serio_raw tg3 firewire_ohci libahci pata_atiixp drm ptp libata firewire_core pps_core
    i2c_core crc_itu_t fjes dm_mirror dm_region_hash dm_log dm_mod
    [ 97.927721] CPU: 1 PID: 3504 Comm: vhost-3495 Not tainted 4.9.0-7.el7.test.x86_64 #1
    [ 97.935457] Hardware name: AMD Snook/Snook, BIOS ESK0726A 07/26/2010
    [ 97.941806] task: ffff880129a1c080 task.stack: ffffc90001bcc000
    [ 97.947720] RIP: 0010:[] [] ipv6_gso_segment+0x119/0x2f0
    [ 97.956251] RSP: 0018:ffff88012fc43a10 EFLAGS: 00010207
    [ 97.961557] RAX: 0000000000000000 RBX: ffff8801292c8700 RCX: 0000000000000594
    [ 97.968687] RDX: 0000000000000593 RSI: ffff880129a846c0 RDI: 0000000000240000
    [ 97.975814] RBP: ffff88012fc43a68 R08: ffff880129a8404e R09: 0000000000000000
    [ 97.982942] R10: 0000000000000000 R11: ffff880129a84076 R12: 00000020002949b3
    [ 97.990070] R13: ffff88012a580000 R14: 0000000000000000 R15: ffff88012a580000
    [ 97.997198] FS: 0000000000000000(0000) GS:ffff88012fc40000(0000) knlGS:0000000000000000
    [ 98.005280] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 98.011021] CR2: 00000000000000cc CR3: 0000000126c5d000 CR4: 00000000000006e0
    [ 98.018149] Stack:
    [ 98.020157] 00000000ffffffff ffff88012fc43ac8 ffffffffa017ad0a 000000000000000e
    [ 98.027584] 0000001300000000 0000000077d59998 ffff8801292c8700 00000020002949b3
    [ 98.035010] ffff88012a580000 0000000000000000 ffff88012a580000 ffff88012fc43a98
    [ 98.042437] Call Trace:
    [ 98.044879] [ 98.046803] [] ? tg3_start_xmit+0x84a/0xd60 [tg3]
    [ 98.053156] [] skb_mac_gso_segment+0xb0/0x130
    [ 98.059158] [] __skb_gso_segment+0x73/0x110
    [ 98.064985] [] validate_xmit_skb+0x12d/0x2b0
    [ 98.070899] [] validate_xmit_skb_list+0x42/0x70
    [ 98.077073] [] sch_direct_xmit+0xd0/0x1b0
    [ 98.082726] [] __dev_queue_xmit+0x486/0x690
    [ 98.088554] [] ? cpumask_next_and+0x35/0x50
    [ 98.094380] [] dev_queue_xmit+0x10/0x20
    [ 98.099863] [] br_dev_queue_push_xmit+0xa7/0x170 [bridge]
    [ 98.106907] [] br_forward_finish+0x41/0xc0 [bridge]
    [ 98.113430] [] ? nf_iterate+0x52/0x60
    [ 98.118735] [] ? nf_hook_slow+0x6b/0xc0
    [ 98.124216] [] __br_forward+0x14c/0x1e0 [bridge]
    [ 98.130480] [] ? br_dev_queue_push_xmit+0x170/0x170 [bridge]
    [ 98.137785] [] br_forward+0x9d/0xb0 [bridge]
    [ 98.143701] [] br_handle_frame_finish+0x267/0x560 [bridge]
    [ 98.150834] [] br_handle_frame+0x174/0x2f0 [bridge]
    [ 98.157355] [] ? sched_clock+0x9/0x10
    [ 98.162662] [] ? sched_clock_cpu+0x72/0xa0
    [ 98.168403] [] __netif_receive_skb_core+0x1e5/0xa20
    [ 98.174926] [] ? timerqueue_add+0x59/0xb0
    [ 98.180580] [] __netif_receive_skb+0x18/0x60
    [ 98.186494] [] process_backlog+0x95/0x140
    [ 98.192145] [] net_rx_action+0x16d/0x380
    [ 98.197713] [] __do_softirq+0xd1/0x283
    [ 98.203106] [] do_softirq_own_stack+0x1c/0x30
    [ 98.209107] [ 98.211029] [] do_softirq+0x50/0x60
    [ 98.216166] [] netif_rx_ni+0x33/0x80
    [ 98.221386] [] tun_get_user+0x487/0x7f0 [tun]
    [ 98.227388] [] tun_sendmsg+0x4b/0x60 [tun]
    [ 98.233129] [] handle_tx+0x282/0x540 [vhost_net]
    [ 98.239392] [] handle_tx_kick+0x15/0x20 [vhost_net]
    [ 98.245916] [] vhost_worker+0x9e/0xf0 [vhost]
    [ 98.251919] [] ? vhost_umem_alloc+0x40/0x40 [vhost]
    [ 98.258440] [] ? do_syscall_64+0x67/0x180
    [ 98.264094] [] kthread+0xd9/0xf0
    [ 98.268965] [] ? kthread_park+0x60/0x60
    [ 98.274444] [] ret_from_fork+0x25/0x30
    [ 98.279836] Code: 8b 93 d8 00 00 00 48 2b 93 d0 00 00 00 4c 89 e6 48 89 df 66 89 93 c2 00 00 00 ff 10 48 3d 00 f0 ff ff 49 89 c2 0f 87 52 01 00 00 8b 92 cc 00 00 00 48 8b 80 d0 00 00 00 44 0f b7 74 10 06 66
    [ 98.299425] RIP [] ipv6_gso_segment+0x119/0x2f0
    [ 98.305612] RSP
    [ 98.309094] CR2: 00000000000000cc
    [ 98.312406] ---[ end trace 726a2c7a2d2d78d0 ]---

    Signed-off-by: Artem Savkov
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Artem Savkov
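
    The guard described in this entry can be sketched in user-space C. This is
    a minimal sketch, not the kernel patch: is_err() and is_err_or_null() are
    hypothetical stand-ins modelled on the kernel's IS_ERR() and
    IS_ERR_OR_NULL() helpers, and struct seg with its gso_type field is a
    made-up placeholder for the segment list that skb_shinfo() would be
    applied to.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Stand-in for IS_ERR(): error codes are encoded in the top 4095 addresses. */
static int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Stand-in for IS_ERR_OR_NULL(): also rejects a plain NULL pointer. */
static int is_err_or_null(const void *p)
{
    return p == NULL || is_err(p);
}

/* Hypothetical placeholder for the data reached through the segment list. */
struct seg {
    int gso_type;
};

/* Returns the field, or -1 when segs must not be dereferenced. */
static int seg_gso_type(const struct seg *segs)
{
    if (is_err_or_null(segs))
        return -1;
    return segs->gso_type;
}
```

    Checking only for an encoded error would let a NULL result through to the
    dereference, which is the failure mode the trace above shows.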
     
  • If some error is encountered in rds_tcp_init_net, make sure to
    unregister_netdevice_notifier(), else we could trigger a panic
    later on, when the modprobe from a netns fails.

    Signed-off-by: Sowmini Varadhan
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Sowmini Varadhan
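
    The unwinding described in this entry follows the usual goto-based cleanup
    pattern. The sketch below is a user-space model under stated assumptions:
    register_notifier(), unregister_notifier(), setup_net() and
    module_init_sketch() are hypothetical names, and the ordering of the steps
    is assumed rather than taken from the rds_tcp source.

```c
#include <errno.h>
#include <stdbool.h>

/* Stands in for membership in the netdevice notifier chain. */
static bool notifier_registered;

static int register_notifier(void)
{
    notifier_registered = true;
    return 0;
}

static void unregister_notifier(void)
{
    notifier_registered = false;
}

/* Hypothetical per-netns setup step that can be forced to fail. */
static int setup_net(bool fail)
{
    return fail ? -ENOMEM : 0;
}

static int module_init_sketch(bool fail)
{
    int ret = register_notifier();

    if (ret)
        return ret;

    ret = setup_net(fail);
    if (ret)
        goto out_notifier;
    return 0;

out_notifier:
    /* Undo the registration so no callback points into a failed module. */
    unregister_notifier();
    return ret;
}
```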
     
  • This reverts commit ae148b085876fa771d9ef2c05f85d4b4bf09ce0d
    ("ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()").

    skb->protocol is now set in __ip_local_out() and __ip6_local_out() before
    dst_output() is called. It is no longer necessary to do it for each tunnel.

    Cc: stable@vger.kernel.org
    Signed-off-by: Eli Cooper
    Signed-off-by: David S. Miller

    Eli Cooper
     
  • When xfrm is applied to TSO/GSO packets, it follows this path:

    xfrm_output() -> xfrm_output_gso() -> skb_gso_segment()

    where skb_gso_segment() relies on skb->protocol to function properly.

    This patch sets skb->protocol to ETH_P_IPV6 before dst_output() is called,
    fixing a bug where GSO packets sent through an ipip6 tunnel are dropped
    when xfrm is involved.

    Cc: stable@vger.kernel.org
    Signed-off-by: Eli Cooper
    Signed-off-by: David S. Miller

    Eli Cooper
     
  • When xfrm is applied to TSO/GSO packets, it follows this path:

    xfrm_output() -> xfrm_output_gso() -> skb_gso_segment()

    where skb_gso_segment() relies on skb->protocol to function properly.

    This patch sets skb->protocol to ETH_P_IP before dst_output() is called,
    fixing a bug where GSO packets sent through a sit tunnel are dropped
    when xfrm is involved.

    Cc: stable@vger.kernel.org
    Signed-off-by: Eli Cooper
    Signed-off-by: David S. Miller

    Eli Cooper
     
  • When packet_set_ring creates a ring buffer it will initialize a
    struct timer_list if the packet version is TPACKET_V3. This value
    can then be raced by a different thread calling setsockopt to
    set the version to TPACKET_V1 before packet_set_ring has finished.

    This leads to a use-after-free on a function pointer in the
    struct timer_list when the socket is closed as the previously
    initialized timer will not be deleted.

    The bug is fixed by taking lock_sock(sk) in packet_setsockopt when
    changing the packet version while also taking the lock at the start
    of packet_set_ring.

    Fixes: f6fb8f100b80 ("af-packet: TPACKET_V3 flexible buffer implementation.")
    Signed-off-by: Philip Pettersson
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Philip Pettersson
     
  • The driver already uses its private lock for synchronization between xmit
    and the xmit completion handler, making the additional use of the xmit_lock
    unnecessary.
    Furthermore, the driver does not set NETIF_F_LLTX, so xmit is called with
    the xmit_lock held and then takes the private lock, while the xmit
    completion handler does the reverse: it first takes the private lock, then
    the xmit_lock.
    Fix these issues by not taking the xmit_lock in the tx completion handler.

    Signed-off-by: Lino Sanfilippo
    Signed-off-by: David S. Miller

    Lino Sanfilippo
     
  • An explicit dma sync for device directly after mapping as well as an
    explicit dma sync for cpu directly before unmapping is unnecessary and
    costly on the hotpath. So remove these calls.

    Signed-off-by: Lino Sanfilippo
    Signed-off-by: David S. Miller

    Lino Sanfilippo
     

02 Dec, 2016

14 commits

  • Johan Hovold says:

    ====================
    net: stmmac: fix probe error handling and phydev leaks

    This series fixes a number of issues with the stmmac-driver probe error
    handling, which for example left clocks enabled after probe failures.

    The final patch fixes a failure to deregister and free any fixed-link
    PHYs that were registered during probe on probe errors and on driver
    unbind. It also fixes a related of-node leak on late probe errors.

    This series depends on the of_phy_deregister_fixed_link() helper that
    was just merged to net.

    As mentioned earlier, one staging driver also suffers from a similar
    leak and can be fixed up once the above mentioned helper hits mainline.

    Note that these patches have only been compile tested.
    ====================

    Acked-by: Giuseppe Cavallaro
    Signed-off-by: David S. Miller

    David S. Miller
     
  • Make sure to deregister and free any fixed-link phy registered during
    probe on probe errors and on driver unbind by adding a new glue helper
    function.

    Drop the of-node reference taken in the same path also on late probe
    errors (and not just on driver unbind) by moving the put from
    stmmac_dvr_remove() to the new helper.

    Fixes: 277323814e49 ("stmmac: add fixed-link device-tree support")
    Fixes: 4613b279bee7 ("ethernet: stmicro: stmmac: add missing of_node_put
    after calling of_parse_phandle")
    Signed-off-by: Johan Hovold
    Acked-by: Maxime Ripard
    Signed-off-by: David S. Miller

    Johan Hovold
     
  • Fix the OF-helper function header to reflect that the function no longer
    has a platform-data parameter.

    Fixes: b0003ead75f3 ("stmmac: make stmmac_probe_config_dt return the
    platform data struct")
    Signed-off-by: Johan Hovold
    Signed-off-by: David S. Miller

    Johan Hovold
     
  • Make sure to disable clocks before returning on late probe errors.

    Fixes: 566e82516253 ("net: stmmac: add a glue driver for the Amlogic
    Meson 8b / GXBB DWMAC")
    Signed-off-by: Johan Hovold
    Acked-by: Kevin Hilman
    Signed-off-by: David S. Miller

    Johan Hovold
     
  • Make sure to call any exit() callback to undo the effect of init()
    before returning on late probe errors.

    Fixes: cf3f047b9af4 ("stmmac: move hw init in the probe (v2)")
    Signed-off-by: Johan Hovold
    Signed-off-by: David S. Miller

    Johan Hovold
     
  • Make sure to disable runtime PM, power down the PHY, and disable clocks
    before returning on late probe errors.

    Fixes: 27ffefd2d109 ("stmmac: dwmac-rk: create a new probe function")
    Signed-off-by: Johan Hovold
    Signed-off-by: David S. Miller

    Johan Hovold
     
  • Make sure to disable clocks before returning on late probe errors.

    Fixes: 8387ee21f972 ("stmmac: dwmac-sti: turn setup callback into a
    probe function")
    Signed-off-by: Johan Hovold
    Signed-off-by: David S. Miller

    Johan Hovold
     
  • Make sure to call stmmac_dvr_remove() before returning on late probe
    errors so that memory is freed, clocks are disabled, and the netdev is
    deregistered before its resources go away.

    Fixes: 3c201b5a84ed ("net: stmmac: socfpga: Remove re-registration of
    reset controller")
    Signed-off-by: Johan Hovold
    Signed-off-by: David S. Miller

    Johan Hovold
     
  • Use the correct attribute constant names IFLA_GSO_MAX_{SEGS,SIZE}
    instead of IFLA_MAX_GSO_{SEGS,SIZE} for the comments in nlmsg_size().

    Cc: Eric Dumazet
    Signed-off-by: Tobias Klauser
    Signed-off-by: David S. Miller

    Tobias Klauser
     
  • In the case of IPIP and SIT tunnel frames the outer transport header
    offset is actually set to the same offset as the inner transport header.
    This results in the lco_csum call not doing any checksum computation over
    the inner IPv4/v6 header data.

    In order to account for that I am updating the code so that we determine
    the location to start the checksum ourselves based on the location of the
    IPv4 header and the length.

    Fixes: b83e30104bd9 ("ixgbe/ixgbevf: Add support for GSO partial")
    Signed-off-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Alexander Duyck
     
  • In the case of IPIP and SIT tunnel frames the outer transport header
    offset is actually set to the same offset as the inner transport header.
    This results in the lco_csum call not doing any checksum computation over
    the inner IPv4/v6 header data.

    In order to account for that I am updating the code so that we determine
    the location to start the checksum ourselves based on the location of the
    IPv4 header and the length.

    Fixes: e10715d3e961 ("igb/igbvf: Add support for GSO partial")
    Reported-by: Stephen Rothwell
    Signed-off-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Alexander Duyck
     
  • The change fixes AX88772_suspend() USB vendor commands failure issues.

    Signed-off-by: Allan Chou
    Tested-by: Allan Chou
    Tested-by: Jon Hunter
    Signed-off-by: David S. Miller

    allan
     
  • Steffen Klassert says:

    ====================
    pull request (net): ipsec 2016-12-01

    1) Change the error value when someone tries to run 32bit
    userspace on a 64bit host from -ENOTSUPP to the userspace
    exported -EOPNOTSUPP. Fix from Yi Zhao.

    2) On inbound, ESN sequence numbers are already in network
    byte order. So don't try to convert it again, this fixes
    integrity verification for ESN. Fixes from Tobias Brunner.

    Please pull or let me know if there are problems.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Pablo Neira Ayuso says:

    ====================
    Netfilter fixes for net

    This is a large batch of Netfilter fixes for net, they are:

    1) Three patches to fix the NAT conversion to rhashtable: switch to the
    rhlist structure, which allows several objects with the same key.
    Moreover, fix the wrong comparison logic in nf_nat_bysource_cmp(), which
    is expected to return a value similar to memcmp(). Change the location of
    the nat_bysource field in the nf_conn structure to avoid zeroing
    it, as this breaks interaction with SLAB_DESTROY_BY_RCU and leads
    to crashes. From Florian Westphal.

    2) Don't allow malformed fragments to go through in IPv6; drop them,
    otherwise we hit a GPF. Patch from Florian Westphal.

    3) Fix crash if attributes are missing in nft_range, from Liping Zhang.

    4) Fix arptables 32-bits userspace 64-bits kernel compat, from Hongxu Jia.

    5) Two patches from David Ahern to fix netfilter interaction with vrf.

    6) Fix element timeout calculation in nf_tables, we take milliseconds
    from userspace, but we use jiffies from kernelspace. Patch from
    Anders K. Pedersen.

    7) Missing validation length netlink attribute for nft_hash, from
    Laura Garcia.

    8) Fix nf_conntrack_helper documentation, we don't default to off
    anymore for a bit of time so let's get this in sync with the code.

    I know it is late but I think these are important, specifically the NAT
    bits, as they are mostly addressing fallout from recent changes. I also
    read there are chances to have -rc8, if that is the case, that would
    also give us a bit more time to test this.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
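
    Item 6 of the pull request above concerns a unit mismatch: timeouts arrive
    from userspace in milliseconds but are kept in jiffies inside the kernel.
    The following user-space sketch shows the arithmetic only. HZ_SKETCH is an
    assumed tick rate of 250, and ms_to_jiffies() and jiffies_to_ms() are
    hypothetical helpers, not the kernel's msecs_to_jiffies() implementation.

```c
#include <stdint.h>

/* Assumed tick rate for illustration; real kernels configure HZ at build time. */
#define HZ_SKETCH 250u

/* Milliseconds to ticks, rounding up so a short timeout never becomes zero. */
static uint64_t ms_to_jiffies(uint64_t ms)
{
    return (ms * HZ_SKETCH + 999) / 1000;
}

/* Ticks back to milliseconds, for reporting to userspace. */
static uint64_t jiffies_to_ms(uint64_t j)
{
    return j * 1000 / HZ_SKETCH;
}
```

    With this assumed tick rate, treating a millisecond count as if it were
    already in jiffies would stretch every timeout by a factor of four.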
     

01 Dec, 2016

16 commits

  • We trigger uarg->callback() immediately after we decide to do datacopy,
    even if the caller wants to do zerocopy. This causes the callback
    (vhost_net_zerocopy_callback) to decrease the refcount. But when we hit
    an error afterwards, the error handling in vhost handle_tx() will try
    to decrease it again. This is wrong; fix it by delaying
    uarg->callback() until we're sure there are no errors.

    Signed-off-by: Jason Wang
    Acked-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller

    Jason Wang
     
  • We trigger uarg->callback() immediately after we decide to do datacopy,
    even if the caller wants to do zerocopy. This causes the callback
    (vhost_net_zerocopy_callback) to decrease the refcount. But when we hit
    an error afterwards, the error handling in vhost handle_tx() will try
    to decrease it again. This is wrong; fix it by delaying
    uarg->callback() until we're sure there are no errors.

    Reported-by: wangyunjian
    Signed-off-by: Jason Wang
    Acked-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller

    Jason Wang
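
    The double decrement described in the two entries above can be modelled
    with a plain counter. This is a simplified user-space sketch: struct
    ubuf_sketch, both send variants and refcount_after_send() are hypothetical
    names, and the caller's error path is reduced to a single decrement.

```c
/* Hypothetical stand-in for the zerocopy bookkeeping object. */
struct ubuf_sketch {
    int refcount;
};

/* Models the completion callback dropping one reference. */
static void zerocopy_callback(struct ubuf_sketch *u)
{
    u->refcount--;
}

/* Old ordering: the callback runs before we know whether the send fails. */
static int send_callback_early(struct ubuf_sketch *u, int fail)
{
    zerocopy_callback(u);
    return fail ? -1 : 0;
}

/* New ordering: the callback runs only once no error can occur. */
static int send_callback_late(struct ubuf_sketch *u, int fail)
{
    if (fail)
        return -1;
    zerocopy_callback(u);
    return 0;
}

/* Models the caller, whose error path drops the reference itself. */
static int refcount_after_send(int (*send)(struct ubuf_sketch *, int), int fail)
{
    struct ubuf_sketch u = { 1 };

    if (send(&u, fail) < 0)
        u.refcount--;
    return u.refcount;
}
```

    In the failing case the early variant leaves the counter below zero, while
    the late variant leaves it at zero in both the success and failure cases.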
     
  • netif_set_real_num_tx/rx_queues() are required to be called with rtnl_lock
    taken, otherwise ASSERT_RTNL() warning will be triggered - which happens
    now during System resume from suspend:
    cpsw_resume()
    |- cpsw_ndo_open()
       |- netif_set_real_num_tx/rx_queues()
          |- ASSERT_RTNL();

    Hence, fix it by surrounding cpsw_ndo_open() by rtnl_lock/unlock() calls.

    Cc: Dave Gerlach
    Cc: Ivan Khoronzhuk
    Fixes: commit e05107e6b747 ("net: ethernet: ti: cpsw: add multi queue support")
    Signed-off-by: Grygorii Strashko
    Reviewed-by: Ivan Khoronzhuk
    Tested-by: Dave Gerlach
    Signed-off-by: David S. Miller

    Grygorii Strashko
     
  • If we have a branch that looks something like this

    int foo = map->value;
    if (condition) {
        foo += blah;
    } else {
        foo = bar;
    }
    map->array[foo] = baz;

    We will incorrectly assume that the !condition branch is equal to the condition
    branch as the register for foo will be UNKNOWN_VALUE in both cases. We need to
    adjust this logic to only do this if we didn't do a varlen access after we
    processed the !condition branch, otherwise we have different ranges and need to
    check the other branch as well.

    Fixes: 484611357c19 ("bpf: allow access into map value arrays")
    Reported-by: Jann Horn
    Signed-off-by: Josef Bacik
    Acked-by: Alexei Starovoitov
    Acked-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Josef Bacik
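
    The reasoning in this entry is that two branches may only be treated as
    equivalent when the value ranges they produce are compatible. The sketch
    below is a conceptual model, not the verifier's code: struct range,
    range_within() and index_in_bounds() are hypothetical, and real verifier
    state carries far more than one interval.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical interval tracked for a register such as foo. */
struct range {
    int64_t min;
    int64_t max;
};

/* A state that was already checked may stand in for the current one only
 * if every value the current state allows was covered by the old state. */
static bool range_within(struct range old, struct range cur)
{
    return cur.min >= old.min && cur.max <= old.max;
}

/* The array access is safe only if the whole interval is a valid index. */
static bool index_in_bounds(struct range idx, int64_t array_len)
{
    return idx.min >= 0 && idx.max < array_len;
}
```

    If one branch yields the interval [0, 3] and the other [0, 100], the second
    is not contained in the first, so the second branch has to be checked on
    its own against the array length.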
     
  • Since 09d9686047db ("netfilter: x_tables: do compat validation via
    translate_table"), the compatr structure is used to fill in the newinfo
    structure. translate_compat_table() in ip_tables.c and ip6_tables.c
    uses compatr->hook_entry in place of info->hook_entry and
    compatr->underflow in place of info->underflow, but the same
    replacement was not done in arp_tables.c.

    This caused a 32-bit "arptables -P INPUT ACCEPT" invocation to fail on a
    64-bit kernel.
    --------------------------------------
    root@qemux86-64:~# arptables -P INPUT ACCEPT
    root@qemux86-64:~# arptables -P INPUT ACCEPT
    ERROR: Policy for `INPUT' offset 448 != underflow 0
    arptables: Incompatible with this kernel
    --------------------------------------

    Fixes: 09d9686047db ("netfilter: x_tables: do compat validation via translate_table")
    Signed-off-by: Hongxu Jia
    Acked-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Hongxu Jia
     
  • …m/linux/kernel/git/kvalo/wireless-drivers

    Kalle Valo says:

    ====================
    wireless-drivers fixes for 4.9

    mwifiex

    * properly terminate SSIDs so that uninitialised memory is not printed
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>

    David S. Miller
     
  • Guillaume Nault says:

    ====================
    l2tp: fixes for l2tp_ip and l2tp_ip6 socket handling

    This series addresses problems found while working on commit 32c231164b76
    ("l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind()").

    The first three patches fix races in socket's connect, recv and bind
    operations. The last two ones fix scenarios where l2tp fails to
    correctly lookup its userspace sockets.

    Apart from the last patch, which is l2tp_ip6 specific, every patch
    fixes the same problem in the L2TP IPv4 and IPv6 code.

    All problems fixed by this series exist since the creation of the
    l2tp_ip and l2tp_ip6 modules.

    Changes since v1:
    * Patch #3: fix possible uninitialised use of 'ret' in l2tp_ip_bind().
    ====================

    Acked-by: James Chapman

    David S. Miller
     
  • The '!(addr && ipv6_addr_equal(addr, laddr))' part of the conditional
    matches if addr is NULL or if addr != laddr.
    But the intent of __l2tp_ip6_bind_lookup() is to find a socket with
    the same address, so the ipv6_addr_equal() condition needs to be
    inverted.

    For better clarity and consistency with the rest of the expression, the
    (!X || X == Y) notation is used instead of !(X && X != Y).

    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault
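
    The two forms of the condition quoted in this entry can be compared side by
    side. In this user-space sketch, struct addr6 and addr_equal() are
    hypothetical stand-ins for struct in6_addr and ipv6_addr_equal(); only the
    boolean structure is taken from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct in6_addr. */
struct addr6 {
    unsigned char b[16];
};

/* Hypothetical stand-in for ipv6_addr_equal(). */
static bool addr_equal(const struct addr6 *a, const struct addr6 *b)
{
    return memcmp(a->b, b->b, sizeof(a->b)) == 0;
}

/* Old form: true when addr is NULL or when the addresses differ. */
static bool old_match(const struct addr6 *addr, const struct addr6 *laddr)
{
    return !(addr && addr_equal(addr, laddr));
}

/* New form: true when addr is NULL or when the addresses are the same. */
static bool new_match(const struct addr6 *addr, const struct addr6 *laddr)
{
    return !addr || addr_equal(addr, laddr);
}
```

    The two forms agree only in the NULL case; for non-NULL addresses they give
    opposite answers.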
     
  • When looking up an l2tp socket, we must consider a null netdevice id as
    a wild card. There are currently two problems caused by
    __l2tp_ip_bind_lookup() not considering 'dif' as a wild card when set to 0:

    * A socket bound to a device (i.e. with sk->sk_bound_dev_if != 0)
    never receives any packet. Since __l2tp_ip_bind_lookup() is called
    with dif == 0 in l2tp_ip_recv(), sk->sk_bound_dev_if is always
    different from 'dif' so the socket doesn't match.

    * Two sockets, one bound to a device but not the other, can be bound
    to the same address. If the first socket binding to the address is
    the one that is also bound to a device, the second socket can bind
    to the same address without __l2tp_ip_bind_lookup() noticing the
    overlap.

    To fix this issue, we need to consider that any null device index, be
    it 'sk->sk_bound_dev_if' or 'dif', matches with any other value.
    We also need to pass the input device index to __l2tp_ip_bind_lookup()
    on reception so that sockets bound to a device never receive packets
    from other devices.

    This patch fixes l2tp_ip6 in the same way.

    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault
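
    The wild-card rule stated in this entry reduces to a small predicate. The
    sketch below is an illustration only: dev_match_old() reconstructs the
    earlier behaviour from the description above and is an assumption, and both
    function names are hypothetical.

```c
#include <stdbool.h>

/* Assumed earlier behaviour: only an unbound socket acts as a wild card. */
static bool dev_match_old(int bound_dev_if, int dif)
{
    return !(bound_dev_if && bound_dev_if != dif);
}

/* Rule from the entry: a zero index on either side matches any value. */
static bool dev_match_new(int bound_dev_if, int dif)
{
    return bound_dev_if == 0 || dif == 0 || bound_dev_if == dif;
}
```

    With a socket bound to device 2 and a lookup using dif == 0, the old
    predicate reports no match and the new one reports a match.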
     
  • It's not enough to check for sockets bound to same address at the
    beginning of l2tp_ip{,6}_bind(): even if no socket is found at that
    time, a socket with the same address could be bound before we take
    the l2tp lock again.

    This patch moves the lookup right before inserting the new socket, so
    that no change can ever happen to the list between address lookup and
    socket insertion.

    Care is taken to avoid side effects on the socket in case of failure.
    That is, modifications of the socket are done after the lookup, when
    binding is guaranteed to succeed, and before releasing the l2tp lock,
    so that concurrent lookups will always see fully initialised sockets.

    For l2tp_ip, 'ret' is set to -EINVAL before checking the SOCK_ZAPPED
    bit. Error code was mistakenly set to -EADDRINUSE on error by commit
    32c231164b76 ("l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind()").
    Using -EINVAL restores original behaviour.

    For l2tp_ip6, the lookup is now always done with the correct bound
    device. Before this patch, when binding to a link-local address, the
    lookup was done with the original sk->sk_bound_dev_if, which was later
    overwritten with addr->l2tp_scope_id. Lookup is now performed with the
    final sk->sk_bound_dev_if value.

    Finally, the (addr_len >= sizeof(struct sockaddr_in6)) check has been
    dropped: addr is a sockaddr_l2tpip6 not sockaddr_in6 and addr_len has
    already been checked at this point (this part of the code seems to have
    been copy-pasted from net/ipv6/raw.c).

    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault
     
  • Socket must be held while under the protection of the l2tp lock; there
    is no guarantee that sk remains valid after the read_unlock_bh() call.

    Same issue for l2tp_ip and l2tp_ip6.

    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault
     
  • Socket flags aren't updated atomically, so the socket must be locked
    while reading the SOCK_ZAPPED flag.

    This issue exists for both l2tp_ip and l2tp_ip6. For IPv6, this patch
    also brings error handling for __ip6_datagram_connect() failures.

    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault
     
  • Signed-off-by: Hariprasad Shenai
    Signed-off-by: David S. Miller

    Hariprasad Shenai
     
  • Executing 'ethtool -S' on a fec device that is down causes an OOPS on a
    Vybrid board:

    Unhandled fault: external abort on non-linefetch (0x1008) at 0xe0898200
    pgd = ddecc000
    [e0898200] *pgd=9e406811, *pte=400d1653, *ppte=400d1453
    Internal error: : 1008 [#1] SMP ARM
    ...

    The reason for the OOPS is that fec_enet_get_ethtool_stats() accesses fec
    registers while the IPG clock is stopped by PM.

    Fix that by caching statistics in fec_enet_private. The cache is
    initialized at device probe time, updated at statistics request time if
    the device is up, and also updated just before turning the device off on
    the down path.

    Additional locking is not needed, since the cached statistics are accessed
    either before the device is registered, or under rtnl_lock().

    Signed-off-by: Nikita Yushchenko
    Signed-off-by: David S. Miller

    Nikita Yushchenko
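
    The caching scheme in this entry can be sketched as follows. This is a
    user-space model under stated assumptions: struct fec_sketch, its hw array
    (standing in for the hardware counters) and the three functions are
    hypothetical, and the clock gating itself is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NSTATS 2

/* Hypothetical device model: hw[] stands in for the counter registers. */
struct fec_sketch {
    bool up;
    uint64_t hw[NSTATS];
    uint64_t cache[NSTATS];
};

/* Copy the counters into the cache; only valid while the device is up. */
static void update_cache(struct fec_sketch *d)
{
    memcpy(d->cache, d->hw, sizeof(d->cache));
}

/* Statistics request: refresh from hardware only if the device is up. */
static void get_stats(struct fec_sketch *d, uint64_t *out)
{
    if (d->up)
        update_cache(d);
    memcpy(out, d->cache, sizeof(d->cache));
}

/* Down path: take a final snapshot before the registers become unusable. */
static void dev_down(struct fec_sketch *d)
{
    update_cache(d);
    d->up = false;
}
```

    Once the device is down, get_stats() never touches hw[] and returns the
    last snapshot instead.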
     
  • vxlan_fdb_append() may return an error, so add the proper check;
    otherwise it will cause a memory leak.

    Signed-off-by: Haishuang Yan

    Changes in v2:
    - Unnecessary to initialize rc to zero.
    Acked-by: Jiri Benc
    Signed-off-by: David S. Miller

    Haishuang Yan
     
  • If nf_ct_frag6_gather() returns an error other than -EINPROGRESS, it
    means that we still have a reference to the skb. We should free it
    before returning from handle_fragments, as stated in the comment above.

    Fixes: daaa7d647f81 ("netfilter: ipv6: avoid nf_iterate recursion")
    CC: Florian Westphal
    CC: Pravin B Shelar
    CC: Joe Stringer
    Signed-off-by: Daniele Di Proietto
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Daniele Di Proietto
     

30 Nov, 2016

2 commits