27 Oct, 2017

2 commits


26 Oct, 2017

2 commits

  • Commit 96edd61dcf44362d3ef0bed1a5361e0ac7886a63 ("xen/balloon: don't
    online new memory initially") introduced a regression when booting a
    HVM domain with memory less than mem-max: instead of ballooning down
    immediately the system would try to use the memory up to mem-max
    resulting in Xen crashing the domain.

    For HVM domains the current size will be reflected in Xenstore node
    memory/static-max instead of memory/target.

    Additionally we have to trigger the ballooning process at once.

    Cc: # 4.13
    Fixes: 96edd61dcf44362d3ef0bed1a5361e0ac7886a63 ("xen/balloon: don't
    online new memory initially")

    Reported-by: Simon Gaiser
    Suggested-by: Boris Ostrovsky
    Signed-off-by: Juergen Gross
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Boris Ostrovsky

    Juergen Gross
     
  • In case gntdev_mmap() succeeds only partially in mapping grant pages,
    it leaves some vital information needed later for cleanup
    uninitialized. This leads to an out-of-bounds array access when
    unmapping the already mapped pages.

    So just initialize the data needed for unmapping the pages a little bit
    earlier.

    Cc:
    Reported-by: Arthur Borsboom
    Signed-off-by: Juergen Gross
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Boris Ostrovsky

    Juergen Gross
     

17 Oct, 2017

1 commit

  • RFC791 specifies the minimum MTU to be 68, while xen-net{front|back}
    drivers use a minimum value of 0.

    When the MTU is set to a value from 0 to 67 with the
    xen-net{front|back} drivers, the network becomes unreachable
    immediately and the guest can no longer be pinged.

    xen-net{front|back} should not allow the user to set a value that
    causes network problems.

    Reported-by: Chen Shi
    Signed-off-by: Mohammed Gamal
    Acked-by: Wei Liu
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Boris Ostrovsky

    Mohammed Gamal
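
The range check described in this commit message can be sketched as follows. This is a minimal userspace illustration, not the driver code: the names `EXAMPLE_MIN_MTU`, `EXAMPLE_MAX_MTU` and `example_mtu_is_valid()` are hypothetical, and the upper bound is an assumption chosen only to make the example complete. Only the lower bound of 68 comes from the text above.

```c
#include <stdbool.h>

/* Lower bound taken from RFC 791 as cited in the commit message. */
#define EXAMPLE_MIN_MTU 68
/* Assumed upper bound, used here only for illustration. */
#define EXAMPLE_MAX_MTU 65535

/* Return true when the requested MTU lies within the accepted range. */
static bool example_mtu_is_valid(int new_mtu)
{
    return new_mtu >= EXAMPLE_MIN_MTU && new_mtu <= EXAMPLE_MAX_MTU;
}
```

With such a check in place, requests in the 0 to 67 range are refused before they can take the interface off the network.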
     

10 Oct, 2017

1 commit


28 Sep, 2017

2 commits

  • When booting a PVM guest with large memory (e.g. 240GB), the initial
    mapping provided by Xen overlaps with the kernel module virtual space.
    When the mapping in this space is cleared by xen_cleanhighmap(), in
    certain cases a 2MB mapping can be left behind. This is because Xen
    initializes a 4MB-aligned mapping but xen_cleanhighmap() finishes at a
    2MB boundary.

    When a module is loaded just on top of that 2MB space, the following
    warning appears:

    WARNING: at mm/vmalloc.c:106 vmap_pte_range+0x14e/0x190()
    Call Trace:
    [] warn_alloc_failed+0xf3/0x160
    [] __vmalloc_area_node+0x182/0x1c0
    [] ? module_alloc_update_bounds+0x1e/0x80
    [] __vmalloc_node_range+0xa7/0x110
    [] ? module_alloc_update_bounds+0x1e/0x80
    [] module_alloc+0x64/0x70
    [] ? module_alloc_update_bounds+0x1e/0x80
    [] module_alloc_update_bounds+0x1e/0x80
    [] move_module+0x27/0x150
    [] layout_and_allocate+0x120/0x1b0
    [] load_module+0x78/0x640
    [] ? security_file_permission+0x8b/0x90
    [] sys_init_module+0x62/0x1e0
    [] system_call_fastpath+0x16/0x1b

    The 2MB mapping is then cleared, and the kernel finally oopses when a
    page in that space is accessed.

    BUG: unable to handle kernel paging request at ffff880022600000
    IP: [] clear_page_c_e+0x7/0x10
    PGD 1788067 PUD 178c067 PMD 22434067 PTE 0
    Oops: 0002 [#1] SMP
    Call Trace:
    [] ? prep_new_page+0x127/0x1c0
    [] get_page_from_freelist+0x1e2/0x550
    [] ? ii_iovec_copy_to_user+0x90/0x140
    [] __alloc_pages_nodemask+0x12d/0x230
    [] alloc_pages_vma+0xc6/0x1a0
    [] ? pte_mfn_to_pfn+0x7d/0x100
    [] do_anonymous_page+0x16b/0x350
    [] handle_pte_fault+0x1e4/0x200
    [] ? xen_pmd_val+0xe/0x10
    [] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
    [] handle_mm_fault+0x15b/0x270
    [] do_page_fault+0x140/0x470
    [] page_fault+0x25/0x30

    Call xen_cleanhighmap() with 4MB alignment for the page table mapping to
    fix it. The unnecessary call of xen_cleanhighmap() in DEBUG mode is also
    removed.

    -v2: add comment about XEN alignment from Juergen.

    References: https://lists.xen.org/archives/html/xen-devel/2012-07/msg01562.html
    Signed-off-by: Zhenzhong Duan
    Reviewed-by: Juergen Gross

    [boris: added 'xen/mmu' tag to commit subject]
    Signed-off-by: Boris Ostrovsky

    Zhenzhong Duan
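
The alignment mismatch described in this commit message comes down to simple arithmetic, sketched below. This is an illustration under stated assumptions, not the kernel code: `example_round_up()` and `example_leftover()` are hypothetical helpers, and the addresses used with them are made up.

```c
#include <stdint.h>

#define EXAMPLE_SZ_2M (2ULL << 20)
#define EXAMPLE_SZ_4M (4ULL << 20)

/* Round addr up to the next multiple of align (a power of two). */
static uint64_t example_round_up(uint64_t addr, uint64_t align)
{
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Bytes of mapping left behind when clearing stops at the next 2MB
 * boundary although the mapping was set up to the next 4MB boundary.
 */
static uint64_t example_leftover(uint64_t end)
{
    return example_round_up(end, EXAMPLE_SZ_4M) -
           example_round_up(end, EXAMPLE_SZ_2M);
}
```

Depending on where the mapped region ends, the leftover is either zero or exactly 2MB, which matches the "in certain cases" wording of the message.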
     
  • Just like done in d2bd05d88d ("xen-pciback: return proper values during
    BAR sizing") for the ROM BAR, ordinary ones also shouldn't compare the
    written value directly against ~0, but consider the r/o bits at the
    bottom (if any).

    Signed-off-by: Jan Beulich
    Reviewed-by: Juergen Gross
    Signed-off-by: Boris Ostrovsky

    Jan Beulich
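
The idea of taking the read-only low bits into account when detecting a BAR sizing write can be sketched as below. This is an assumption-laden illustration rather than the xen-pciback implementation: `example_is_sizing_write()` and the `ro_bits` parameter are hypothetical names, and the mask value 0x0f used in the usage is only an example of low flag bits.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * A sizing write sets every writable bit. Bits that are read-only
 * (ro_bits) may hold any value in the written word, so they are
 * forced to 1 before comparing against all-ones.
 */
static bool example_is_sizing_write(uint32_t value, uint32_t ro_bits)
{
    return (value | ro_bits) == UINT32_MAX;
}
```

A direct comparison against ~0 would miss a write such as 0xfffffff0, where the caller left the read-only flag bits clear.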
     

22 Sep, 2017

1 commit

  • In the case where sizeof(maddr) != sizeof(long), p is initialized but
    never read, and clang throws a warning on this. Move the declaration
    of p to clean up the clang build warning:

    warning: Value stored to 'p' during its initialization is never read

    Signed-off-by: Colin Ian King
    Reviewed-by: Juergen Gross
    Signed-off-by: Boris Ostrovsky

    Colin Ian King
     

17 Sep, 2017

9 commits

  • Linus Torvalds
     
  • Pull UBI updates from Richard Weinberger:
    "Minor improvements"

    * tag 'upstream-4.14-rc1' of git://git.infradead.org/linux-ubifs:
    UBI: Fix two typos in comments
    ubi: fastmap: fix spelling mistake: "invalidiate" -> "invalidate"
    ubi: pr_err() strings should end with newlines
    ubi: pr_err() strings should end with newlines
    ubi: pr_err() strings should end with newlines

    Linus Torvalds
     
  • Pull UML updates from Richard Weinberger:

    - minor improvements

    - fixes for Debian's new gcc defaults (pie enabled by default)

    - fixes for XSTATE/XSAVE to make UML work again on modern systems

    * 'for-linus-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
    um: return negative in tuntap_open_tramp()
    um: remove a stray tab
    um: Use relative modversions with LD_SCRIPT_DYN
    um: link vmlinux with -no-pie
    um: Fix CONFIG_GCOV for modules.
    Fix minor typos and grammar in UML start_up help
    um: defconfig: Cleanup from old Kconfig options
    um: Fix FP register size for XSTATE/XSAVE

    Linus Torvalds
     
  • Pull networking fixes from David Miller:

    1) Fix hotplug deadlock in hv_netvsc, from Stephen Hemminger.

    2) Fix double-free in rmnet driver, from Dan Carpenter.

    3) INET connection socket layer can double put request sockets, fix
    from Eric Dumazet.

    4) Don't match collect metadata-mode tunnels if the device is down,
    from Haishuang Yan.

    5) Do not perform TSO6/GSO on ipv6 packets with extensions headers in
    be2net driver, from Suresh Reddy.

    6) Fix scaling error in gen_estimator, from Eric Dumazet.

    7) Fix 64-bit statistics deadlock in systemport driver, from Florian
    Fainelli.

    8) Fix use-after-free in sctp_sock_dump, from Xin Long.

    9) Reject invalid BPF_END instructions in verifier, from Edward Cree.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits)
    mlxsw: spectrum_router: Only handle IPv4 and IPv6 events
    Documentation: link in networking docs
    tcp: fix data delivery rate
    bpf/verifier: reject BPF_ALU64|BPF_END
    sctp: do not mark sk dumped when inet_sctp_diag_fill returns err
    sctp: fix an use-after-free issue in sctp_sock_dump
    netvsc: increase default receive buffer size
    tcp: update skb->skb_mstamp more carefully
    net: ipv4: fix l3slave check for index returned in IP_PKTINFO
    net: smsc911x: Quieten netif during suspend
    net: systemport: Fix 64-bit stats deadlock
    net: vrf: avoid gcc-4.6 warning
    qed: remove unnecessary call to memset
    tg3: clean up redundant initialization of tnapi
    tls: make tls_sw_free_resources static
    sctp: potential read out of bounds in sctp_ulpevent_type_enabled()
    MAINTAINERS: review Renesas DT bindings as well
    net_sched: gen_estimator: fix scaling error in bytes/packets samples
    nfp: wait for the NSP resource to appear on boot
    nfp: wait for board state before talking to the NSP
    ...

    Linus Torvalds
     
  • Pull more input updates from Dmitry Torokhov:
    "A second round of updates for the input subsystem:

    - a new driver for PWM-controlled vibrators

    - ucb1400 touchscreen driver had completely busted suspend/resume
    handling

    - we now handle "home" button found on some devices with Goodix
    touchscreens

    - assorted other fixups"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
    Input: i8042 - add Gigabyte P57 to the keyboard reset table
    Input: xpad - validate USB endpoint type during probe
    Input: ucb1400_ts - fix suspend and resume handling
    Input: edt-ft5x06 - fix access to non-existing register
    Input: elantech - make arrays debounce_packet static, reduces object code size
    Input: surface3_spi - make const array header static, reduces object code size
    Input: goodix - add support for capacitive home button
    Input: add a driver for PWM controllable vibrators
    Input: adi - make array seq static, reduces object code size

    Linus Torvalds
     
  • Commit 5620a0d1aac ("firmware: delete in-kernel firmware") removed the
    entire firmware directory. Unfortunately it thereby also removed the
    support for built-in firmware.

    This restores the ability to build firmware directly into the kernel by
    pruning the original Makefile to the necessary minimum. The default for
    EXTRA_FIRMWARE_DIR is now the standard directory /lib/firmware/.

    Fixes: 5620a0d1aac ("firmware: delete in-kernel firmware")
    Signed-off-by: Markus Trippelsdorf
    Acked-by: Greg K-H
    Signed-off-by: Linus Torvalds

    Markus Trippelsdorf
     
  • The driver doesn't support events from address families other than IPv4
    and IPv6, so ignore them. Otherwise, we risk queueing a work item before
    it's initialized.

    This can happen in case a VRF is configured when MROUTE_MULTIPLE_TABLES
    is enabled, as the VRF driver will try to add an l3mdev rule for the
    IPMR family.

    Fixes: 65e65ec137f4 ("mlxsw: spectrum_router: Don't ignore IPv6 notifications")
    Signed-off-by: Ido Schimmel
    Reported-by: Andreas Rammhold
    Reported-by: Florian Klink
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
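
The filtering described in this commit message can be pictured with a small userspace sketch. It is not the mlxsw code: `example_family_is_handled()` is a hypothetical helper that only shows the "handle IPv4 and IPv6, ignore everything else" decision, using the standard address family constants.

```c
#include <stdbool.h>
#include <sys/socket.h>

/* Return true only for the address families the handler supports. */
static bool example_family_is_handled(int family)
{
    switch (family) {
    case AF_INET:
    case AF_INET6:
        return true;
    default:
        /* Any other family is ignored before work is queued. */
        return false;
    }
}
```

Returning early for unsupported families means no work item is touched for an event the driver cannot process.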
     
  • Fix link in filter.txt.

    Acked-by: Pavel Machek

    Signed-off-by: David S. Miller

    Pavel Machek
     
  • Now that skb->skb_mstamp is updated later, we also need to call
    tcp_rate_skb_sent() after the update is done.

    Fixes: 8c72c65b426b ("tcp: update skb->skb_mstamp more carefully")
    Signed-off-by: Eric Dumazet
    Acked-by: Soheil Hassas Yeganeh
    Signed-off-by: David S. Miller

    Eric Dumazet
     

16 Sep, 2017

21 commits

  • Pull MIPS updates from Ralf Baechle:
    "This is the main pull request for 4.14 for MIPS; below a summary of
    the non-merge commits:

    CM:
    - Rename mips_cm_base to mips_gcr_base
    - Specify register size when generating accessors
    - Use BIT/GENMASK for register fields, order & drop shifts
    - Add cluster & block args to mips_cm_lock_other()

    CPC:
    - Use common CPS accessor generation macros
    - Use BIT/GENMASK for register fields, order & drop shifts
    - Introduce register modify (set/clear/change) accessors
    - Use change_*, set_* & clear_* where appropriate
    - Add CM/CPC 3.5 register definitions
    - Use GlobalNumber macros rather than magic numbers
    - Have asm/mips-cps.h include CM & CPC headers
    - Cluster support for topology functions
    - Detect CPUs in secondary clusters

    CPS:
    - Read GIC_VL_IDENT directly, not via irqchip driver

    DMA:
    - Consolidate coherent and non-coherent dma_alloc code
    - Don't use dma_cache_sync to implement fd_cacheflush

    FPU emulation / FP assist code:
    - Another series of 14 commits fixing corner cases such as NaN
    propagation and other special input values.
    - Zero bits 32-63 of the result for a CLASS.D instruction.
    - Enhanced statistics via debugfs
    - Do not use bools for arithmetic. GCC 7.1 moans about this.
    - Correct user fault_addr type

    Generic MIPS:
    - Enhancement of stack backtraces
    - Cleanup from non-existing options
    - Handle non word sized instructions when examining frame
    - Fix detection and decoding of ADDIUSP instruction
    - Fix decoding of SWSP16 instruction
    - Refactor handling of stack pointer in get_frame_info
    - Remove unreachable code from force_fcr31_sig()
    - Convert to using %pOF instead of full_name
    - Remove the R6000 support.
    - Move FP code from *_switch.S to *_fpu.S
    - Remove unused ST_OFF from r2300_switch.S
    - Allow platform to specify multiple its.S files
    - Add #includes to various files to ensure code builds reliably and
    without warnings.
    - Remove __invalidate_kernel_vmap_range
    - Remove plat_timer_setup
    - Declare various variables & functions static
    - Abstract CPU core & VP(E) ID access through accessor functions
    - Store core & VP IDs in GlobalNumber-style variable
    - Unify checks for sibling CPUs
    - Add CPU cluster number accessors
    - Prevent direct use of generic_defconfig
    - Make CONFIG_MIPS_MT_SMP default y
    - Add __ioread64_copy
    - Remove unnecessary inclusions of linux/irqchip/mips-gic.h

    GIC:
    - Introduce asm/mips-gic.h with accessor functions
    - Use new GIC accessor functions in mips-gic-timer
    - Remove counter access functions from irq-mips-gic.c
    - Remove gic_read_local_vp_id() from irq-mips-gic.c
    - Simplify shared interrupt pending/mask reads in irq-mips-gic.c
    - Simplify gic_local_irq_domain_map() in irq-mips-gic.c
    - Drop gic_(re)set_mask() functions in irq-mips-gic.c
    - Remove gic_set_polarity(), gic_set_trigger(), gic_set_dual_edge(),
    gic_map_to_pin() and gic_map_to_vpe() from irq-mips-gic.c.
    - Convert remaining shared reg access, local int mask access and
    remaining local reg access to new accessors
    - Move GIC_LOCAL_INT_* to asm/mips-gic.h
    - Remove GIC_CPU_INT* macros from irq-mips-gic.c
    - Move various definitions to the driver
    - Remove gic_get_usm_range()
    - Remove __gic_irq_dispatch() forward declaration
    - Remove gic_init()
    - Use mips_gic_present() in place of gic_present and remove
    gic_present
    - Move gic_get_c0_*_int() to asm/mips-gic.h
    - Remove linux/irqchip/mips-gic.h
    - Inline __gic_init()
    - Inline gic_basic_init()
    - Make pcpu_masks a per-cpu variable
    - Use pcpu_masks to avoid reading GIC_SH_MASK*
    - Clean up mti, reserved-cpu-vectors handling
    - Use cpumask_first_and() in gic_set_affinity()
    - Let the core set struct irq_common_data affinity

    microMIPS:
    - Fix microMIPS stack unwinding on big endian systems

    MIPS-GIC:
    - SYNC after enabling GIC region

    NUMA:
    - Remove the unused parent_node() macro

    R6:
    - Constify r2_decoder_tables
    - Add accessor & bit definitions for GlobalNumber

    SMP:
    - Constify smp ops
    - Allow boot_secondary SMP op to return errors

    VDSO:
    - Drop gic_get_usm_range() usage
    - Avoid use of linux/irqchip/mips-gic.h

    Platform changes:

    Alchemy:
    - Add devboard machine type to cpuinfo
    - update cpu feature overrides
    - Threaded carddetect irqs for devboards

    AR7:
    - allow NULL clock for clk_get_rate

    BCM63xx:
    - Fix ENETDMA_6345_MAXBURST_REG offset
    - Allow NULL clock for clk_get_rate

    CI20:
    - Enable GPIO and RTC drivers in defconfig
    - Add ethernet and fixed-regulator nodes to DTS

    Generic platform:
    - Move Boston and NI 169445 FIT image source to their own files
    - Include asm/bootinfo.h for plat_fdt_relocated()
    - Include asm/time.h for get_c0_*_int()
    - Include asm/bootinfo.h for plat_fdt_relocated()
    - Include asm/time.h for get_c0_*_int()
    - Allow filtering enabled boards by requirements
    - Don't explicitly disable CONFIG_USB_SUPPORT
    - Bump default NR_CPUS to 16

    JZ4780:
    - Probe the jz4740-rtc driver from devicetree

    Lantiq:
    - Drop check of boot select from the spi-falcon driver.
    - Drop check of boot select from the lantiq-flash MTD driver.
    - Access boot cause register in the watchdog driver through regmap
    - Add device tree binding documentation for the watchdog driver
    - Add docs for the RCU DT bindings.
    - Convert the fpi bus driver to a platform_driver
    - Remove ltq_reset_cause() and ltq_boot_select()
    - Switch to a proper reset driver
    - Switch to a new drivers/soc GPHY driver
    - Add an USB PHY driver for the Lantiq SoCs using the RCU module
    - Use of_platform_default_populate instead of __dt_register_buses
    - Enable MFD_SYSCON to be able to use it for the RCU MFD
    - Replace ltq_boot_select() with dummy implementation.

    Loongson 2F:
    - Allow NULL clock for clk_get_rate

    Malta:
    - Use new GIC accessor functions

    NI 169445:
    - Add support for NI 169445 board.
    - Only include in 32r2el kernels

    Octeon:
    - Add support for watchdog of 78XX SOCs.
    - Add support for watchdog of CN68XX SOCs.
    - Expose support for mips32r1, mips32r2 and mips64r1
    - Enable more drivers in config file
    - Add support for accessing the boot vector.
    - Remove old boot vector code from watchdog driver
    - Define watchdog registers for 70xx, 73xx, 78xx, F75xx.
    - Make CSR functions node aware.
    - Allow access to CIU3 IRQ domains.
    - Misc cleanups in the watchdog driver

    Omega2+:
    - New board, add support and defconfig

    Pistachio:
    - Enable Root FS on NFS in defconfig

    Ralink:
    - Add Mediatek MT7628A SoC
    - Allow NULL clock for clk_get_rate
    - Explicitly request exclusive reset control in the pci-mt7620 PCI driver.

    SEAD3:
    - Only include in 32 bit kernels by default

    VoCore:
    - Add VoCore as a vendor to dt-bindings
    - Add defconfig file"

    * '4.14-features' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (167 commits)
    MIPS: Refactor handling of stack pointer in get_frame_info
    MIPS: Stacktrace: Fix microMIPS stack unwinding on big endian systems
    MIPS: microMIPS: Fix decoding of swsp16 instruction
    MIPS: microMIPS: Fix decoding of addiusp instruction
    MIPS: microMIPS: Fix detection of addiusp instruction
    MIPS: Handle non word sized instructions when examining frame
    MIPS: ralink: allow NULL clock for clk_get_rate
    MIPS: Loongson 2F: allow NULL clock for clk_get_rate
    MIPS: BCM63XX: allow NULL clock for clk_get_rate
    MIPS: AR7: allow NULL clock for clk_get_rate
    MIPS: BCM63XX: fix ENETDMA_6345_MAXBURST_REG offset
    mips: Save all registers when saving the frame
    MIPS: Add DWARF unwinding to assembly
    MIPS: Make SAVE_SOME more standard
    MIPS: Fix issues in backtraces
    MIPS: jz4780: DTS: Probe the jz4740-rtc driver from devicetree
    MIPS: Ci20: Enable RTC driver
    watchdog: octeon-wdt: Add support for 78XX SOCs.
    watchdog: octeon-wdt: Add support for cn68XX SOCs.
    watchdog: octeon-wdt: File cleaning.
    ...

    Linus Torvalds
     
  • Pull PCI fix from Bjorn Helgaas:
    "Revert an attempt to fix a race while enabling upstream bridges
    because it broke iwlwifi firmware loading"

    * tag 'pci-v4.14-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
    Revert "PCI: Avoid race while enabling upstream bridges"

    Linus Torvalds
     
  • Pull drm AMD fixes from Dave Airlie:
    "Just had a single AMD fixes pull from Alex for rc1"

    * tag 'drm-fixes-for-v4.14-rc1' of git://people.freedesktop.org/~airlied/linux:
    drm/amdgpu: revert "fix deadlock of reservation between cs and gpu reset v2"
    drm/amdgpu: remove duplicate return statement
    drm/amdgpu: check memory allocation failure
    drm/amd/amdgpu: fix BANK_SELECT on Vega10 (v2)
    drm/amdgpu: inline amdgpu_ttm_do_bind again
    drm/amdgpu: fix amdgpu_ttm_bind
    drm/amdgpu: remove the GART copy hack
    drm/ttm:fix wrong decoding of bo_count
    drm/ttm: fix missing inc bo_count
    drm/amdgpu: set sched_hw_submission higher for KIQ (v3)
    drm/amdgpu: move default gart size setting into gmc modules
    drm/amdgpu: refine default gart size
    drm/amd/powerplay: ACG frequency added in PPTable
    drm/amdgpu: discard commands of killed processes
    drm/amdgpu: fix and cleanup shadow handling
    drm/amdgpu: add automatic per asic settings for gart_size
    drm/amdgpu/gfx8: fix spelling typo in mqd allocation
    drm/amd/powerplay: unhalt mec after loading
    drm/amdgpu/virtual_dce: Virtual display doesn't support disable vblank immediately
    drm/amdgpu: Fix huge page updates with CPU

    Linus Torvalds
     
  • Pull more i2c updates from Wolfram Sang:
    "I2C has two more new drivers: Altera FPGA and STM32F7"

    * 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
    i2c: i2c-stm32f7: add driver
    i2c: i2c-stm32f4: use generic definition of speed enum
    dt-bindings: i2c-stm32: Document the STM32F7 I2C bindings
    i2c: altera: Add Altera I2C Controller driver
    dt-bindings: i2c: Add Altera I2C Controller

    Linus Torvalds
     
  • Pull more KVM updates from Paolo Bonzini:
    - PPC bugfixes
    - RCU splat fix
    - swait races fix
    - pointless userspace-triggerable BUG() fix
    - misc fixes for KVM_RUN corner cases
    - nested virt correctness fixes + one host DoS
    - some cleanups
    - clang build fix
    - fix AMD AVIC with default QEMU command line options
    - x86 bugfixes

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (28 commits)
    kvm: nVMX: Handle deferred early VMLAUNCH/VMRESUME failure properly
    kvm: vmx: Handle VMLAUNCH/VMRESUME failure properly
    kvm: nVMX: Remove nested_vmx_succeed after successful VM-entry
    kvm,mips: Fix potential swait_active() races
    kvm,powerpc: Serialize wq active checks in ops->vcpu_kick
    kvm: Serialize wq active checks in kvm_vcpu_wake_up()
    kvm,x86: Fix apf_task_wake_one() wq serialization
    kvm,lapic: Justify use of swait_active()
    kvm,async_pf: Use swq_has_sleeper()
    sched/wait: Add swq_has_sleeper()
    KVM: VMX: Do not BUG() on out-of-bounds guest IRQ
    KVM: Don't accept obviously wrong gsi values via KVM_IRQFD
    kvm: nVMX: Don't allow L2 to access the hardware CR8
    KVM: trace events: update list of exit reasons
    KVM: async_pf: Fix #DF due to inject "Page not Present" and "Page Ready" exceptions simultaneously
    KVM: X86: Don't block vCPU if there is pending exception
    KVM: SVM: Add irqchip_split() checks before enabling AVIC
    KVM: Add struct kvm_vcpu pointer parameter to get_enable_apicv()
    KVM: SVM: Refactor AVIC vcpu initialization into avic_init_vcpu()
    KVM: x86: fix clang build
    ...

    Linus Torvalds
     
  • Neither ___bpf_prog_run nor the JITs accept it.
    Also adds a new test case.

    Fixes: 17a5267067f3 ("bpf: verifier (add verifier core)")
    Signed-off-by: Edward Cree
    Acked-by: Alexei Starovoitov
    Acked-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Edward Cree
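
The opcode check implied by this commit can be sketched in plain C. This is a simplified illustration, not the verifier code: the `EX_`-prefixed macros and `example_opcode_is_rejected()` are hypothetical names, with values chosen to mirror the eBPF opcode encoding (instruction class in the low three bits, operation in the high four bits).

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the opcode encoding, defined here for illustration. */
#define EX_BPF_CLASS(code) ((code) & 0x07)
#define EX_BPF_OP(code)    ((code) & 0xf0)
#define EX_BPF_ALU         0x04
#define EX_BPF_ALU64       0x07
#define EX_BPF_ADD         0x00
#define EX_BPF_END         0xd0

/* Reject the byte-swap operation when encoded in the 64-bit ALU class. */
static bool example_opcode_is_rejected(uint8_t code)
{
    return EX_BPF_CLASS(code) == EX_BPF_ALU64 &&
           EX_BPF_OP(code) == EX_BPF_END;
}
```

Only the combination of the 64-bit class with the END operation is refused; the same operation in the 32-bit ALU class and other 64-bit operations pass this particular check.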
     
  • sctp_diag would not actually dump out sk/asoc if inet_sctp_diag_fill
    returns err, in which case it shouldn't mark sk dumped by setting
    cb->args[3] as 1 in sctp_sock_dump().

    Otherwise, it could cause some asocs to have no parent's sk dumped
    in 'ss --sctp'.

    So this patch is to not set cb->args[3] when inet_sctp_diag_fill()
    returns err in sctp_sock_dump().

    Fixes: 8f840e47f190 ("sctp: add the sctp_diag.c file")
    Signed-off-by: Xin Long
    Acked-by: Marcelo Ricardo Leitner
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Xin Long
     
  • Commit 86fdb3448cc1 ("sctp: ensure ep is not destroyed before doing the
    dump") tried to fix a use-after-free issue by checking !sctp_sk(sk)->ep
    while holding sock and sock lock.

    But Paolo noticed that endpoint could be destroyed in sctp_rcv without
    sock lock protection. It means the use-after-free issue still could be
    triggered when sctp_rcv put and destroy ep after sctp_sock_dump checks
    !ep, although it's pretty hard to reproduce.

    I could reproduce it by adding an mdelay in sctp_rcv and a long msleep
    in sctp_close and sctp_sock_dump.

    This patch is to add another param cb_done to sctp_for_each_transport
    and dump ep->assocs with holding tsp after jumping out of transport's
    traversal in it to avoid this issue.

    It can also improve sctp diag dump to make it run faster, as no need
    to save sk into cb->args[5] and keep calling sctp_for_each_transport
    any more.

    This patch also uses int * instead of int for the pos argument of
    sctp_for_each_transport, so that the position is incremented only in
    sctp_for_each_transport and there is no need to keep changing
    cb->args[2] in sctp_sock_filter and sctp_sock_dump any more.

    Fixes: 86fdb3448cc1 ("sctp: ensure ep is not destroyed before doing the dump")
    Reported-by: Paolo Abeni
    Signed-off-by: Xin Long
    Acked-by: Marcelo Ricardo Leitner
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Xin Long
     
  • The default receive buffer size was reduced by a recent change
    to a value which was appropriate for 10G and Windows Server 2016.
    But the value is too small for full performance with 40G on Azure.
    Increase the default back to the maximum supported by the host.

    Fixes: 8b5327975ae1 ("netvsc: allow controlling send/recv buffer size")
    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • liujian reported a problem in TCP_USER_TIMEOUT processing with a patch
    in tcp_probe_timer() :
    https://www.spinics.net/lists/netdev/msg454496.html

    After investigations, the root cause of the problem is that we update
    skb->skb_mstamp of skbs in write queue, even if the attempt to send a
    clone or copy of it failed. One reason being a routing problem.

    This patch prevents this, solving liujian's issue.

    It also removes a potential RTT miscalculation, since
    __tcp_retransmit_skb() is not OR-ing TCP_SKB_CB(skb)->sacked with
    TCPCB_EVER_RETRANS if a failure happens, but skb->skb_mstamp has
    been changed.

    A future ACK would then lead to a very small RTT sample and min_rtt
    would then be lowered to this too small value.

    Tested:

    # cat user_timeout.pkt
    --local_ip=192.168.102.64

    0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
    +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
    +0 bind(3, ..., ...) = 0
    +0 listen(3, 1) = 0

    +0 `ifconfig tun0 192.168.102.64/16; ip ro add 192.0.2.1 dev tun0`

    +0 < S 0:0(0) win 0
    +0 > S. 0:0(0) ack 1

    +.1 < . 1:1(0) ack 1 win 65530
    +0 accept(3, ..., ...) = 4

    +0 setsockopt(4, SOL_TCP, TCP_USER_TIMEOUT, [3000], 4) = 0
    +0 write(4, ..., 24) = 24
    +0 > P. 1:25(24) ack 1 win 29200
    +.1 < . 1:1(0) ack 25 win 65530

    //change the ipaddress
    +1 `ifconfig tun0 192.168.0.10/16`

    +1 write(4, ..., 24) = 24
    +1 write(4, ..., 24) = 24
    +1 write(4, ..., 24) = 24
    +1 write(4, ..., 24) = 24

    +0 `ifconfig tun0 192.168.102.64/16`
    +0 < . 1:2(1) ack 25 win 65530
    +0 `ifconfig tun0 192.168.0.10/16`

    +3 write(4, ..., 24) = -1

    # ./packetdrill user_timeout.pkt

    Signed-off-by: Eric Dumazet
    Reported-by: liujian
    Acked-by: Neal Cardwell
    Acked-by: Yuchung Cheng
    Acked-by: Soheil Hassas Yeganeh
    Signed-off-by: David S. Miller

    Eric Dumazet
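
The principle behind this fix, recording the send timestamp only when the transmit attempt succeeded, can be sketched outside the kernel. Everything below is hypothetical and greatly simplified: `struct example_pkt`, `example_send()` and the two stub transmit functions are illustration-only names and do not correspond to the TCP stack's data structures.

```c
#include <stdint.h>

/* Minimal stand-in for a queued packet with a "last sent" timestamp. */
struct example_pkt {
    uint64_t sent_stamp;
};

typedef int (*example_xmit_fn)(const struct example_pkt *pkt);

/* Stub transmitters: one always succeeds, one always fails. */
static int example_xmit_ok(const struct example_pkt *pkt)
{
    (void)pkt;
    return 0;
}

static int example_xmit_fail(const struct example_pkt *pkt)
{
    (void)pkt;
    return -1; /* e.g. a routing problem */
}

/* Update the timestamp only after a successful transmit attempt. */
static int example_send(struct example_pkt *pkt, uint64_t now,
                        example_xmit_fn xmit)
{
    int err = xmit(pkt);

    if (err == 0)
        pkt->sent_stamp = now;
    return err;
}
```

Because a failed attempt leaves the old timestamp in place, a later acknowledgement is measured against the last transmission that actually went out, which avoids the artificially small RTT sample mentioned above.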
     
  • rt_iif is only set to the actual egress device for the output path. The
    recent change to consider the l3slave flag when returning IP_PKTINFO
    works for local traffic (the correct device index is returned), but it
    broke the more typical use case of packets received from a remote host
    always returning the VRF index rather than the original ingress device.
    Update the fixup to consider l3slave and rt_iif actually getting set.

    Fixes: 1dfa76390bf05 ("net: ipv4: add check for l3slave for index returned in IP_PKTINFO")
    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     
  • If the network interface is kept running during suspend, the net core
    may call net_device_ops.ndo_start_xmit() while the Ethernet device is
    still suspended, which may lead to a system crash.

    E.g. on sh73a0/kzm9g and r8a73a4/ape6evm, the external Ethernet chip is
    driven by a PM controlled clock. If the Ethernet registers are accessed
    while the clock is not running, the system will crash with an imprecise
    external abort.

    As this is a race condition with a small time window, it is not so easy
    to trigger at will. Using pm_test may increase your chances:

    # echo 0 > /sys/module/printk/parameters/console_suspend
    # echo platform > /sys/power/pm_test
    # echo mem > /sys/power/state

    To fix this, make sure the network interface is quietened during
    suspend.

    Signed-off-by: Geert Uytterhoeven
    Reviewed-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Geert Uytterhoeven
     
  • We can enter a deadlock situation because there is not sufficient
    protection when ndo_get_stats64() runs in process context to guard
    against RX or TX NAPI contexts running in softirq. This can lead to the
    following lockdep splat, and an actual deadlock was experienced as well
    with an iperf session in the background and a while loop doing
    ifconfig + ethtool.

    [ 5.780350] ================================
    [ 5.784679] WARNING: inconsistent lock state
    [ 5.789011] 4.13.0-rc7-02179-g32fae27c725d #70 Not tainted
    [ 5.794561] --------------------------------
    [ 5.798890] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
    [ 5.804971] swapper/0/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
    [ 5.810175] (&syncp->seq#2){+.?...}, at: [] bcm_sysport_tx_reclaim+0x30/0x54
    [ 5.818327] {SOFTIRQ-ON-W} state was registered at:
    [ 5.823278] bcm_sysport_get_stats64+0x17c/0x258
    [ 5.828053] dev_get_stats+0x38/0xac
    [ 5.831776] rtnl_fill_stats+0x30/0x118
    [ 5.835761] rtnl_fill_ifinfo+0x538/0xe24
    [ 5.839921] rtmsg_ifinfo_build_skb+0x6c/0xd8
    [ 5.844430] rtmsg_ifinfo_event.part.5+0x14/0x44
    [ 5.849201] rtmsg_ifinfo+0x20/0x28
    [ 5.852837] register_netdevice+0x628/0x6b8
    [ 5.857171] register_netdev+0x14/0x24
    [ 5.861051] bcm_sysport_probe+0x30c/0x438
    [ 5.865280] platform_drv_probe+0x50/0xb0
    [ 5.869418] driver_probe_device+0x2e8/0x450
    [ 5.873817] __driver_attach+0x104/0x120
    [ 5.877871] bus_for_each_dev+0x7c/0xc0
    [ 5.881834] bus_add_driver+0x1b0/0x270
    [ 5.885797] driver_register+0x78/0xf4
    [ 5.889675] do_one_initcall+0x54/0x190
    [ 5.893646] kernel_init_freeable+0x144/0x1d0
    [ 5.898135] kernel_init+0x8/0x110
    [ 5.901665] ret_from_fork+0x14/0x2c
    [ 5.905363] irq event stamp: 24263
    [ 5.908804] hardirqs last enabled at (24262): [] net_rx_action+0xc4/0x4e4
    [ 5.916624] hardirqs last disabled at (24263): [] _raw_spin_lock_irqsave+0x1c/0x98
    [ 5.925143] softirqs last enabled at (24258): [] irq_enter+0x84/0x98
    [ 5.932524] softirqs last disabled at (24259): [] irq_exit+0x108/0x16c
    [ 5.939985]
    [ 5.939985] other info that might help us debug this:
    [ 5.946576] Possible unsafe locking scenario:
    [ 5.946576]
    [ 5.952556] CPU0
    [ 5.955031] ----
    [ 5.957506] lock(&syncp->seq#2);
    [ 5.960955]
    [ 5.963604] lock(&syncp->seq#2);
    [ 5.967227]
    [ 5.967227] *** DEADLOCK ***
    [ 5.967227]
    [ 5.973222] 1 lock held by swapper/0/0:
    [ 5.977092] #0: (&(&ring->lock)->rlock){..-...}, at: [] bcm_sysport_tx_reclaim+0x20/0x54

    So just remove the u64_stats_update_begin()/end() pair in ndo_get_stats64()
    since it does not appear to be useful for anything. No inconsistency was
    observed with either ifconfig or ethtool, global TX counts equal the sum of
    per-queue TX counts on a 32-bit architecture.

    Fixes: 10377ba7673d ("net: systemport: Support 64bit statistics")
    Signed-off-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Florian Fainelli
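
The locking pattern behind this fix can be sketched in userspace. This is a minimal, single-threaded sketch and not the kernel's `u64_stats_sync` implementation: the names `stats_sync`, `stats_init`, `stats_write_add` and `stats_read` are hypothetical, and C11 atomics are assumed. It illustrates that only the code updating a counter brackets its write with a begin/end pair, while a reporting path (such as a get-stats callback) only samples the sequence number and retries. The assertion in the writer flags a nested write section, which is roughly the situation the lockdep report above warns about.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for a seqcount-protected statistic. */
struct stats_sync {
    atomic_uint seq;    /* odd while an update is in progress */
    uint64_t tx_bytes;  /* the value being protected */
};

static void stats_init(struct stats_sync *s)
{
    atomic_init(&s->seq, 0);
    s->tx_bytes = 0;
}

/* Writer side: the analogue of an update_begin()/update_end() pair. */
static void stats_write_add(struct stats_sync *s, uint64_t bytes)
{
    /* begin: sequence becomes odd */
    unsigned int prev = atomic_fetch_add(&s->seq, 1);

    /* An odd previous value would mean a write section was re-entered. */
    assert((prev & 1u) == 0);

    s->tx_bytes += bytes;

    /* end: sequence becomes even again */
    atomic_fetch_add(&s->seq, 1);
}

/* Reader side: never writes the sequence; it samples and retries. */
static uint64_t stats_read(struct stats_sync *s)
{
    unsigned int start;
    uint64_t val;

    do {
        start = atomic_load(&s->seq);
        val = s->tx_bytes;
    } while ((start & 1u) || start != atomic_load(&s->seq));

    return val;
}
```

A caller would run `stats_init()` once, call `stats_write_add()` from the single update path, and call `stats_read()` from any reporting path. A real concurrent version needs more care with memory ordering than this sketch shows.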
     
  • When building an allmodconfig kernel with gcc-4.6, we get a rather
    odd warning:

    drivers/net/vrf.c: In function ‘vrf_ip6_input_dst’:
    drivers/net/vrf.c:964:3: error: initialized field with side-effects overwritten [-Werror]
    drivers/net/vrf.c:964:3: error: (near initialization for ‘fl6’) [-Werror]

    I have no idea what this warning is even trying to say, but it does
    seem like a false positive. Reordering the initialization to match
    the structure definition gets rid of the warning, and might also avoid
    whatever gcc thinks is wrong here.

    Fixes: 9ff74384600a ("net: vrf: Handle ipv6 multicast and link-local addresses")
    Signed-off-by: Arnd Bergmann
    Signed-off-by: David S. Miller

    Arnd Bergmann
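
The kind of reordering described in this commit can be illustrated with a small sketch. The `flow_key` structure and `make_flow_key()` helper below are hypothetical and are not the kernel's `struct flowi6`. The point is only that the designated initializers are listed in the same order as the fields are declared, with each field named at most once.

```c
#include <stdint.h>

/* Hypothetical flow key; NOT the kernel's struct flowi6. */
struct flow_key {
    int      iif;        /* input interface index */
    uint32_t flowlabel;  /* flow label bits */
    uint8_t  proto;      /* upper-layer protocol number */
    uint8_t  flags;      /* lookup flags */
};

static struct flow_key make_flow_key(int ifindex, uint32_t label, uint8_t proto)
{
    /* Initializers appear in declaration order, one per field. */
    struct flow_key key = {
        .iif       = ifindex,
        .flowlabel = label,
        .proto     = proto,
        /* .flags is not named, so it is zero-initialized */
    };

    return key;
}
```

In standard C the order of designated initializers does not change the resulting value, so a reordering like this is behavior-neutral. It only avoids presenting the compiler with an order that differs from the declaration.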
     
  • Calling memset() to zero memory immediately after allocating it
    with kzalloc() is unnecessary, as kzalloc() already returns
    zero-filled memory.

    Semantic patch used to resolve this issue:

    @@
    expression e,e2; constant c;
    statement S;
    @@

    e = kzalloc(e2, c);
    if(e == NULL) S
    - memset(e, 0, e2);

    Signed-off-by: Himanshu Jha
    Signed-off-by: Himanshu Jha
    Acked-by: Sudarsana Kalluru
    Signed-off-by: David S. Miller

    Himanshu Jha
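
The same redundancy can be shown in userspace, where `calloc()` plays the role that `kzalloc()` plays in the kernel. This is an analogue under that assumption, not kernel code; `ring_state`, `ring_state_alloc()` and `is_all_zero()` are hypothetical names introduced for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical structure standing in for a driver's private state. */
struct ring_state {
    unsigned int head;
    unsigned int tail;
    unsigned char buf[64];
};

/* Userspace analogue: calloc() returns zero-filled memory, like kzalloc(). */
static struct ring_state *ring_state_alloc(void)
{
    struct ring_state *rs = calloc(1, sizeof(*rs));

    if (rs == NULL)
        return NULL;

    /* No memset(rs, 0, sizeof(*rs)) needed: the allocation is already zeroed. */
    return rs;
}

/* Helper that checks every byte of an object is zero. */
static int is_all_zero(const void *p, size_t len)
{
    const unsigned char *bytes = p;
    size_t i;

    for (i = 0; i < len; i++) {
        if (bytes[i] != 0)
            return 0;
    }
    return 1;
}
```

A caller would check the result of `ring_state_alloc()` for NULL and release it with `free()` when done.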
     
  • …el/git/gregkh/driver-core

    Pull firmware removal from Greg KH:
    "Many many years ago (at the kernel summit in Boston), we all came to
    the agreement that the firmware/ tree should be dropped from the
    kernel, and everyone use the linux-firmware package instead. For some
    minor reason, David Woodhouse didn't send the pull request at that
    point in time, and everyone forgot about this.

    The topic came up in the hallway track at the Plumbers conference this
    week, so here's a single patch that drops the whole firmware tree. The
    last firmware update was back in 2013, and all distros have been using
    linux-firmware instead since at least that year, if not before. The
    only commits to that directory since 2013 were some kbuild fixups for
    various build tool issues.

    So let's finally drop this, we don't need to lug them around in the
    kernel source tree anymore, especially as no one wants or uses them.

    This has passed build testing with 0-day, I don't think it made it
    into linux-next this week, but I figured it was good to get in before
    4.14-rc1 was out"

    * tag 'firmware_removal-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
    firmware: delete in-kernel firmware

    Linus Torvalds
     
  • Pull arch/nios2 update from Ley Foon Tan.

    * tag 'nios2-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
    nios2: time: Read timer in get_cycles only if initialized
    nios2: add earlycon support to 3c120 devboard DTS

    Linus Torvalds
     
  • Pull powerpc fix from Michael Ellerman:
    "Just one fix, for the handling of alignment interrupts on dcbz
    instructions.

    Thanks to Paul Mackerras, Christian Zigotzky, Michal Sojka"

    * tag 'powerpc-4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
    powerpc: Fix handling of alignment interrupt on dcbz instruction

    Linus Torvalds
     
  • Pull orangefs updates from Mike Marshall:
    "Some cleanups and a big bug fix for ACLs.

    When I was reviewing Jan Kara's ACL patch, I realized that Orangefs
    ACL code was busted, not just in the kernel module, but in the server
    as well. I've been working on the code in the server mostly, but
    here's one kernel patch, there will be more"

    * tag 'for-linus-4.14-ofs2' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
    orangefs: Adjust three checks for null pointers
    orangefs: Use kcalloc() in orangefs_prepare_cdm_array()
    orangefs: Delete error messages for a failed memory allocation in five functions
    orangefs: constify xattr_handler structure
    orangefs: don't call filemap_write_and_wait from fsync
    orangefs: off by ones in xattr size checks
    orangefs: documentation clean up
    orangefs: react properly to posix_acl_update_mode's aftermath.
    orangefs: Don't clear SGID when inheriting ACLs

    Linus Torvalds
     
  • Prepare second round of input updates for 4.14 merge window.

    Dmitry Torokhov
     
  • Similar to other Gigabyte laptops, the touchpad on P57 requires a
    keyboard reset to detect Elantech touchpad correctly.

    BugLink: https://bugs.launchpad.net/bugs/1594214
    Signed-off-by: Kai-Heng Feng
    Cc: stable@vger.kernel.org
    Signed-off-by: Dmitry Torokhov

    Kai-Heng Feng
     

15 Sep, 2017

1 commit

  • When emulating a nested VM-entry from L1 to L2, several control field
    validation checks are deferred to the hardware. Should one of these
    validation checks fail, vmx_vcpu_run will set the vmx->fail flag. When
    this happens, the L2 guest state is not loaded (even in part), and
    execution should continue in L1 with the next instruction after the
    VMLAUNCH/VMRESUME.

    The VMCS12 is not modified (except for the VM-instruction error
    field), the VMCS12 MSR save/load lists are not processed, and the CPU
    state is not loaded from the VMCS12 host area. Moreover, the vmcs02
    exit reason is stale, so it should not be consulted for any reason.

    Signed-off-by: Jim Mattson
    Signed-off-by: Paolo Bonzini

    Jim Mattson