09 May, 2020

1 commit

  • On ARM64, the alloc and free paths are not symmetric, so the free
    path is currently incorrect. It was designed for x86.

    This patch is not a mature fix, just a temporary workaround to avoid
    an ARM64 dom0 panic.

    Acked-by: Alice Guo
    Signed-off-by: Peng Fan
    (cherry picked from commit bf7843fbbb4ea0f8590f00d255e16c0d05c57873)

    Peng Fan
     

29 Feb, 2020

1 commit

  • commit 8645e56a4ad6dcbf504872db7f14a2f67db88ef2 upstream.

    xen_maybe_preempt_hcall() is called from the exception entry point
    xen_do_hypervisor_callback with interrupts disabled.

    _cond_resched() evades the might_sleep() check in cond_resched() which
    would have caught that and schedule_debug() unfortunately lacks a check
    for irqs_disabled().

    Enable interrupts around the call and use cond_resched() to catch future
    issues.

    Fixes: fdfd811ddde3 ("x86/xen: allow privcmd hypercalls to be preempted")
    Signed-off-by: Thomas Gleixner
    Link: https://lore.kernel.org/r/878skypjrh.fsf@nanos.tec.linutronix.de
    Reviewed-by: Juergen Gross
    Signed-off-by: Boris Ostrovsky
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
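
The fix described in this commit can be modeled in user space. This is a minimal sketch, not kernel code: `irqs_off`, `model_cond_resched()` and `model_maybe_preempt_hcall()` are hypothetical stand-ins for the interrupt state, for `cond_resched()` with its `might_sleep()` check, and for `xen_maybe_preempt_hcall()`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the CPU interrupt state; true means disabled. */
static bool irqs_off;
/* Counts how often the modeled might_sleep() check fired. */
static int sleep_check_hits;

/* Stand-in for cond_resched(): complains when called with irqs disabled. */
static void model_cond_resched(void)
{
    if (irqs_off)
        sleep_check_hits++;   /* the real check would print a warning */
}

/* Stand-in for the fixed xen_maybe_preempt_hcall(): entered with irqs off. */
static void model_maybe_preempt_hcall(void)
{
    assert(irqs_off);         /* precondition: called from exception entry */
    irqs_off = false;         /* local_irq_enable() in the kernel */
    model_cond_resched();
    irqs_off = true;          /* local_irq_disable() before returning */
}
```

Calling `model_cond_resched()` directly while `irqs_off` is set trips the modeled check, which is the kind of misuse the switch from `_cond_resched()` to `cond_resched()` is meant to expose.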
     

11 Feb, 2020

1 commit

  • commit eda4eabf86fd6806eaabc23fb90dd056fdac037b upstream.

    Commit 3aa6c19d2f38be ("xen/balloon: Support xend-based toolstack")
    tried to fix a regression with running on rather ancient Xen versions.
    Unfortunately the fix was based on the assumption that xend would
    just use another Xenstore node, but in reality only some downstream
    versions of xend are doing that. The upstream xend does not write
    that Xenstore node at all, so the problem must be fixed in another
    way.

    The easiest way to achieve that is to fall back to the behavior
    before commit 96edd61dcf4436 ("xen/balloon: don't online new memory
    initially") in case the static memory maximum can't be read.

    This is achieved by setting static_max to the current number of
    memory pages known by the system resulting in target_diff becoming
    zero.

    Fixes: 3aa6c19d2f38be ("xen/balloon: Support xend-based toolstack")
    Signed-off-by: Juergen Gross
    Reviewed-by: Boris Ostrovsky
    Cc: stable@vger.kernel.org # 4.13
    Signed-off-by: Boris Ostrovsky
    Signed-off-by: Greg Kroah-Hartman

    Juergen Gross
     

09 Jan, 2020

1 commit

  • [ Upstream commit c673ec61ade89bf2f417960f986bc25671762efb ]

    When CONFIG_XEN_BALLOON_MEMORY_HOTPLUG is not defined
    reserve_additional_memory() will set balloon_stats.target_pages to a
    wrong value in case there are still some ballooned pages allocated via
    alloc_xenballooned_pages().

    This will result in balloon_process() no longer being triggered when
    ballooned pages are freed in batches.

    Reported-by: Nicholas Tsirakis
    Signed-off-by: Juergen Gross
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Juergen Gross
    Signed-off-by: Sasha Levin

    Juergen Gross
     

31 Dec, 2019

1 commit

  • [ Upstream commit fa6614d8ef13c63aac52ad7c07c5e69ce4aba3dd ]

    DMA_SHARED_BUFFER can not be enabled by the user (it represents a library
    set in the kernel). The kconfig convention is to use select for such
    symbols so they are turned on implicitly when the user enables a kconfig
    that needs them.

    Otherwise the XEN_GNTDEV_DMABUF kconfig is overly difficult to enable.

    Fixes: 932d6562179e ("xen/gntdev: Add initial support for dma-buf UAPI")
    Cc: Oleksandr Andrushchenko
    Cc: Boris Ostrovsky
    Cc: xen-devel@lists.xenproject.org
    Cc: Juergen Gross
    Cc: Stefano Stabellini
    Reviewed-by: Juergen Gross
    Reviewed-by: Oleksandr Andrushchenko
    Signed-off-by: Jason Gunthorpe
    Signed-off-by: Juergen Gross
    Signed-off-by: Sasha Levin

    Jason Gunthorpe
     

20 Oct, 2019

1 commit

  • Pull networking fixes from David Miller:
    "I was battling a cold after some recent trips, so quite a bit piled up
    meanwhile, sorry about that.

    Highlights:

    1) Fix fd leak in various bpf selftests, from Brian Vazquez.

    2) Fix crash in xsk when device doesn't support some methods, from
    Magnus Karlsson.

    3) Fix various leaks and use-after-free in rxrpc, from David Howells.

    4) Fix several SKB leaks due to confusion of who owns an SKB and who
    should release it in the llc code. From Eric Biggers.

    5) Kill a bunch of KCSAN warnings in TCP, from Eric Dumazet.

    6) Jumbo packets don't work after resume on r8169, as the BIOS resets
    the chip into non-jumbo mode during suspend. From Heiner Kallweit.

    7) Corrupt L2 header during MPLS push, from Davide Caratti.

    8) Prevent possible infinite loop in tc_ctl_action, from Eric
    Dumazet.

    9) Get register bits right in bcmgenet driver, based upon chip
    version. From Florian Fainelli.

    10) Fix mutex problems in microchip DSA driver, from Marek Vasut.

    11) Cure race between route lookup and invalidation in ipv4, from Wei
    Wang.

    12) Fix performance regression due to false sharing in 'net'
    structure, from Eric Dumazet"

    * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (145 commits)
    net: reorder 'struct net' fields to avoid false sharing
    net: dsa: fix switch tree list
    net: ethernet: dwmac-sun8i: show message only when switching to promisc
    net: aquantia: add an error handling in aq_nic_set_multicast_list
    net: netem: correct the parent's backlog when corrupted packet was dropped
    net: netem: fix error path for corrupted GSO frames
    macb: propagate errors when getting optional clocks
    xen/netback: fix error path of xenvif_connect_data()
    net: hns3: fix mis-counting IRQ vector numbers issue
    net: usb: lan78xx: Connect PHY before registering MAC
    vsock/virtio: discard packets if credit is not respected
    vsock/virtio: send a credit update when buffer size is changed
    mlxsw: spectrum_trap: Push Ethernet header before reporting trap
    net: ensure correct skb->tstamp in various fragmenters
    net: bcmgenet: reset 40nm EPHY on energy detect
    net: bcmgenet: soft reset 40nm EPHYs before MAC init
    net: phy: bcm7xxx: define soft_reset for 40nm EPHY
    net: bcmgenet: don't set phydev->link from MAC
    net: Update address for MediaTek ethernet driver in MAINTAINERS
    ipv4: fix race condition between route lookup and invalidation
    ...

    Linus Torvalds
     

13 Oct, 2019

1 commit


10 Oct, 2019

3 commits

  • As the removed comments say, these aren't DT based devices.
    of_dma_configure() is going to stop allowing a NULL DT node and calling
    it will no longer work.

    The comment is also now out of date as of commit 9ab91e7c5c51 ("arm64:
    default to the direct mapping in get_arch_dma_ops"). Direct mapping
    is now the default rather than dma_dummy_ops.

    According to Stefano and Oleksandr, the only other part needed is
    setting the DMA masks and there's no reason to restrict the masks to
    32-bits. So set the masks to 64 bits.

    Cc: Robin Murphy
    Cc: Julien Grall
    Cc: Nicolas Saenz Julienne
    Cc: Oleksandr Andrushchenko
    Cc: Boris Ostrovsky
    Cc: Juergen Gross
    Cc: Stefano Stabellini
    Cc: Christoph Hellwig
    Cc: xen-devel@lists.xenproject.org
    Signed-off-by: Rob Herring
    Acked-by: Oleksandr Andrushchenko
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Boris Ostrovsky

    Rob Herring
     
  • xen_auto_xlat_grant_frames.vaddr is definitely NULL in this case.
    So the address printing is unnecessary.

    Signed-off-by: Fuqian Huang
    Reviewed-by: Juergen Gross
    Signed-off-by: Boris Ostrovsky

    Fuqian Huang
     
  • reqsk_queue_empty() is called from inet_csk_listen_poll() while
    other cpus might write ->rskq_accept_head value.

    Use {READ|WRITE}_ONCE() to avoid compiler tricks
    and potential KCSAN splats.

    Fixes: fff1f3001cc5 ("tcp: add a spinlock to protect struct request_sock_queue")
    Signed-off-by: Eric Dumazet
    Signed-off-by: Jakub Kicinski

    Eric Dumazet
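
The `{READ|WRITE}_ONCE()` pattern from the last commit above can be sketched in user space. The macros below are simplified versions built on a volatile cast (the kernel's definitions are more elaborate), and `struct request_queue_model`, `queue_empty()` and `queue_set_head()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified user-space versions; the kernel macros are more elaborate.
 * __typeof__ is a GCC/Clang extension. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

struct request { struct request *next; };

/* Hypothetical cut-down queue; only the accept head matters here. */
struct request_queue_model { struct request *accept_head; };

/* Lockless reader: exactly one load of the shared pointer. */
static bool queue_empty(const struct request_queue_model *q)
{
    assert(q != NULL);
    return READ_ONCE(q->accept_head) == NULL;
}

/* Writer side (runs under a lock in the kernel): exactly one store. */
static void queue_set_head(struct request_queue_model *q, struct request *r)
{
    assert(q != NULL);
    WRITE_ONCE(q->accept_head, r);
}
```

The volatile access keeps the compiler from tearing, fusing or re-reading the load and store, which is what the commit means by "compiler tricks". It does not add any ordering between CPUs.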
     

05 Oct, 2019

1 commit

  • Pull xen fixes and cleanups from Juergen Gross:

    - a fix in the Xen balloon driver avoiding hitting a BUG_ON() in some
    cases, plus a follow-on cleanup series for that driver

    - a patch for introducing non-blocking EFI callbacks in Xen's EFI
    driver, plus a cleanup patch for Xen EFI handling merging the x86 and
    ARM arch specific initialization into the Xen EFI driver

    - a fix of the Xen xenbus driver avoiding a self-deadlock when cleaning
    up after a user process has died

    - a fix for Xen on ARM after removal of ZONE_DMA

    - a cleanup patch for avoiding build warnings for Xen on ARM

    * tag 'for-linus-5.4-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    xen/xenbus: fix self-deadlock after killing user process
    xen/efi: have a common runtime setup function
    arm: xen: mm: use __GPF_DMA32 for arm64
    xen/balloon: Clear PG_offline in balloon_retrieve()
    xen/balloon: Mark pages PG_offline in balloon_append()
    xen/balloon: Drop __balloon_append()
    xen/balloon: Set pages PageOffline() in balloon_add_region()
    ARM: xen: unexport HYPERVISOR_platform_op function
    xen/efi: Set nonblocking callbacks

    Linus Torvalds
     

03 Oct, 2019

1 commit

  • In case a user process using xenbus has open transactions and is killed
    e.g. via ctrl-C the following cleanup of the allocated resources might
    result in a deadlock due to trying to end a transaction in the xenbus
    worker thread:

    [ 2551.474706] INFO: task xenbus:37 blocked for more than 120 seconds.
    [ 2551.492215] Tainted: P OE 5.0.0-29-generic #5
    [ 2551.510263] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [ 2551.528585] xenbus D 0 37 2 0x80000080
    [ 2551.528590] Call Trace:
    [ 2551.528603] __schedule+0x2c0/0x870
    [ 2551.528606] ? _cond_resched+0x19/0x40
    [ 2551.528632] schedule+0x2c/0x70
    [ 2551.528637] xs_talkv+0x1ec/0x2b0
    [ 2551.528642] ? wait_woken+0x80/0x80
    [ 2551.528645] xs_single+0x53/0x80
    [ 2551.528648] xenbus_transaction_end+0x3b/0x70
    [ 2551.528651] xenbus_file_free+0x5a/0x160
    [ 2551.528654] xenbus_dev_queue_reply+0xc4/0x220
    [ 2551.528657] xenbus_thread+0x7de/0x880
    [ 2551.528660] ? wait_woken+0x80/0x80
    [ 2551.528665] kthread+0x121/0x140
    [ 2551.528667] ? xb_read+0x1d0/0x1d0
    [ 2551.528670] ? kthread_park+0x90/0x90
    [ 2551.528673] ret_from_fork+0x35/0x40

    Fix this by doing the cleanup via a workqueue instead.

    Reported-by: James Dingwall
    Fixes: fd8aa9095a95c ("xen: optimize xenbus driver for multiple concurrent xenstore accesses")
    Cc: stable@vger.kernel.org # 4.11
    Signed-off-by: Juergen Gross
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Boris Ostrovsky

    Juergen Gross
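
The idea of the fix, handing the cleanup to a workqueue instead of running it in the worker thread that would have to answer it, can be modeled with a tiny deferred-call queue. This is a single-threaded sketch under stated assumptions: `defer_work()`, `run_deferred_work()` and `free_file_state()` are hypothetical names, not the kernel workqueue API.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WORK 8

typedef void (*work_fn)(void *arg);

struct work_item { work_fn fn; void *arg; };

static struct work_item pending[MAX_WORK];
static size_t n_pending;

/* Queue a callback instead of running it in the caller's context. */
static int defer_work(work_fn fn, void *arg)
{
    assert(fn != NULL);
    if (n_pending == MAX_WORK)
        return -1;            /* queue full */
    pending[n_pending].fn = fn;
    pending[n_pending].arg = arg;
    n_pending++;
    return 0;
}

/* Runs later, from a context that is allowed to block. */
static void run_deferred_work(void)
{
    for (size_t i = 0; i < n_pending; i++)
        pending[i].fn(pending[i].arg);
    n_pending = 0;
}

/* Hypothetical cleanup that, in the kernel, waits for a xenstore reply. */
static void free_file_state(void *arg)
{
    int *freed = arg;
    *freed = 1;
}
```

In the trace above, `xenbus_thread` ends up in `xenbus_transaction_end()` and waits for a reply that only it could deliver. Deferring the cleanup means the thread returns to its loop first and the blocking call happens elsewhere.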
     

02 Oct, 2019

5 commits

  • Today the EFI runtime functions are set up in architecture specific
    code (x86 and arm), with the functions themselves living in drivers/xen
    as they are not architecture dependent.

    As the setup is exactly the same for arm and x86, move the setup to
    drivers/xen, too. This also removes the need to make the individual
    functions globally visible.

    Signed-off-by: Juergen Gross
    Reviewed-by: Jan Beulich
    [boris: "Dropped EXPORT_SYMBOL_GPL(xen_efi_runtime_setup)"]
    Signed-off-by: Boris Ostrovsky

    Juergen Gross
     
  • Let's move the clearing to balloon_retrieve(). In
    bp_state increase_reservation(), we now clear the flag a little earlier
    than before, however, this should not matter for XEN.

    Suggested-by: Boris Ostrovsky
    Cc: Boris Ostrovsky
    Cc: Juergen Gross
    Cc: Stefano Stabellini
    Signed-off-by: David Hildenbrand
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Boris Ostrovsky

    David Hildenbrand
     
  • Let's move the __SetPageOffline() call which all callers perform into
    balloon_append().

    In bp_state decrease_reservation(), pages are now marked PG_offline a
    little later than before, however, this should not matter for XEN.

    Suggested-by: Boris Ostrovsky
    Cc: Boris Ostrovsky
    Cc: Juergen Gross
    Cc: Stefano Stabellini
    Signed-off-by: David Hildenbrand
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Boris Ostrovsky

    David Hildenbrand
     
  • Let's simply use balloon_append() directly.

    Cc: Boris Ostrovsky
    Cc: Juergen Gross
    Cc: Stefano Stabellini
    Signed-off-by: David Hildenbrand
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Boris Ostrovsky

    David Hildenbrand
     
  • We are missing a __SetPageOffline(), which is why we can get
    !PageOffline() pages onto the balloon list, where
    alloc_xenballooned_pages() will complain:

    page:ffffea0003e7ffc0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0
    flags: 0xffffe00001000(reserved)
    raw: 000ffffe00001000 dead000000000100 dead000000000200 0000000000000000
    raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
    page dumped because: VM_BUG_ON_PAGE(!PageOffline(page))
    ------------[ cut here ]------------
    kernel BUG at include/linux/page-flags.h:744!
    invalid opcode: 0000 [#1] SMP NOPTI

    Reported-by: Marek Marczykowski-Górecki
    Tested-by: Marek Marczykowski-Górecki
    Fixes: 77c4adf6a6df ("xen/balloon: mark inflated pages PG_offline")
    Cc: stable@vger.kernel.org # v5.1+
    Cc: Boris Ostrovsky
    Cc: Juergen Gross
    Cc: Stefano Stabellini
    Signed-off-by: David Hildenbrand
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Boris Ostrovsky

    David Hildenbrand
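
The invariant that the balloon commits above establish (a page is marked offline when it is appended to the balloon list and unmarked when it is retrieved) can be sketched as follows. `struct page_model` and the `_model` functions are hypothetical simplifications; the kernel keeps the mark in the page flags as PG_offline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cut-down page; the kernel keeps PG_offline in page flags. */
struct page_model {
    bool offline;
    struct page_model *next;
};

static struct page_model *ballooned_pages;   /* singly linked balloon list */

/* Every page entering the balloon is marked offline here, in one place. */
static void balloon_append_model(struct page_model *page)
{
    assert(page != NULL);
    page->offline = true;
    page->next = ballooned_pages;
    ballooned_pages = page;
}

/* Every page leaving the balloon has the mark cleared here. */
static struct page_model *balloon_retrieve_model(void)
{
    struct page_model *page = ballooned_pages;

    if (page == NULL)
        return NULL;
    assert(page->offline);    /* mirrors the !PageOffline() check reported */
    ballooned_pages = page->next;
    page->next = NULL;
    page->offline = false;
    return page;
}
```

Because setting and clearing each happen in exactly one function, no caller can put an unmarked page on the list, which is the failure the VM_BUG_ON_PAGE() report shows.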
     

27 Sep, 2019

2 commits


20 Sep, 2019

1 commit

  • Pull dma-mapping updates from Christoph Hellwig:

    - add dma-mapping and block layer helpers to take care of IOMMU merging
    for mmc plus subsequent fixups (Yoshihiro Shimoda)

    - rework handling of the pgprot bits for remapping (me)

    - take care of the dma direct infrastructure for swiotlb-xen (me)

    - improve the dma noncoherent remapping infrastructure (me)

    - better defaults for ->mmap, ->get_sgtable and ->get_required_mask
    (me)

    - cleanup mmaping of coherent DMA allocations (me)

    - various misc cleanups (Andy Shevchenko, me)

    * tag 'dma-mapping-5.4' of git://git.infradead.org/users/hch/dma-mapping: (41 commits)
    mmc: renesas_sdhi_internal_dmac: Add MMC_CAP2_MERGE_CAPABLE
    mmc: queue: Fix bigger segments usage
    arm64: use asm-generic/dma-mapping.h
    swiotlb-xen: merge xen_unmap_single into xen_swiotlb_unmap_page
    swiotlb-xen: simplify cache maintainance
    swiotlb-xen: use the same foreign page check everywhere
    swiotlb-xen: remove xen_swiotlb_dma_mmap and xen_swiotlb_dma_get_sgtable
    xen: remove the exports for xen_{create,destroy}_contiguous_region
    xen/arm: remove xen_dma_ops
    xen/arm: simplify dma_cache_maint
    xen/arm: use dev_is_dma_coherent
    xen/arm: consolidate page-coherent.h
    xen/arm: use dma-noncoherent.h calls for xen-swiotlb cache maintainance
    arm: remove wrappers for the generic dma remap helpers
    dma-mapping: introduce a dma_common_find_pages helper
    dma-mapping: always use VM_DMA_COHERENT for generic DMA remap
    vmalloc: lift the arm flag for coherent mappings to common code
    dma-mapping: provide a better default ->get_required_mask
    dma-mapping: remove the dma_declare_coherent_memory export
    remoteproc: don't allow modular build
    ...

    Linus Torvalds
     

13 Sep, 2019

1 commit

  • If the MCFG area is not reserved in E820, Xen by default will defer its
    usage until Dom0 registers it explicitly after the ACPI parser recognizes
    it as a reserved resource in DSDT. Having it reserved in E820 is not
    mandatory according to "PCI Firmware Specification, rev 3.2" (par. 4.1.2)
    and firmware is free to keep a hole in E820 in that place. Xen doesn't know
    what exactly is inside this hole since it lacks a full ACPI view of the
    platform. It is therefore potentially harmful to access the MCFG region
    without additional checks, as some machines are known to provide
    inconsistent information on the size of the region.

    Currently xen_mcfg_late() runs after acpi_init(), which is too late as some
    basic PCI enumeration starts exactly there as well. Trying to register a
    device prior to MCFG reservation causes multiple problems with PCIe extended
    capability initializations in Xen (e.g. SR-IOV VF BAR sizing). There are
    no convenient hooks for us to subscribe to, so register MCFG areas earlier,
    upon the first invocation of xen_add_device(). It should be safe to do this
    once, since all the boot time buses must have their MCFG areas in the MCFG
    table already and we don't support PCI bus hot-plug.

    Signed-off-by: Igor Druzhinin
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Boris Ostrovsky

    Igor Druzhinin
     

11 Sep, 2019

5 commits


06 Sep, 2019

1 commit


03 Aug, 2019

1 commit

  • Pull xen fixes from Juergen Gross:

    - a small cleanup

    - a fix for a build error on ARM with some configs

    - a fix of a patch for the Xen gntdev driver

    - three patches for fixing a potential problem in the swiotlb-xen
    driver which Konrad was fine with me carrying them through the Xen
    tree

    * tag 'for-linus-5.3a-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    xen/swiotlb: remember having called xen_create_contiguous_region()
    xen/swiotlb: simplify range_straddles_page_boundary()
    xen/swiotlb: fix condition for calling xen_destroy_contiguous_region()
    xen: avoid link error on ARM
    xen/gntdev.c: Replace vm_map_pages() with vm_map_pages_zero()
    xen/pciback: remove set but not used variable 'old_state'

    Linus Torvalds
     

01 Aug, 2019

3 commits

  • Instead of always calling xen_destroy_contiguous_region() in case the
    memory is DMA-able for the used device, do so only in case it has been
    made DMA-able via xen_create_contiguous_region() before.

    This will avoid a lot of xen_destroy_contiguous_region() calls for
    64-bit capable devices.

    As the memory in question is owned by swiotlb-xen the PG_owner_priv_1
    flag of the first allocated page can be used for remembering.

    Signed-off-by: Juergen Gross
    Acked-by: Konrad Rzeszutek Wilk
    Signed-off-by: Juergen Gross

    Juergen Gross
     
  • range_straddles_page_boundary() is open coding several macros from
    include/xen/page.h. Use those instead. Additionally there is no need
    to have check_pages_physically_contiguous() as a separate function as
    it is used only once, so merge it into range_straddles_page_boundary().

    Signed-off-by: Juergen Gross
    Reviewed-by: Boris Ostrovsky
    Acked-by: Konrad Rzeszutek Wilk
    Signed-off-by: Juergen Gross

    Juergen Gross
     
  • The condition in xen_swiotlb_free_coherent() for deciding whether to
    call xen_destroy_contiguous_region() is wrong: in case the region to
    be freed is not contiguous calling xen_destroy_contiguous_region() is
    the wrong thing to do: it would result in inconsistent mappings of
    multiple PFNs to the same MFN. This will lead to various strange
    crashes or data corruption.

    Instead of calling xen_destroy_contiguous_region() in that case a
    warning should be issued as that situation should never occur.

    Cc: stable@vger.kernel.org
    Signed-off-by: Juergen Gross
    Reviewed-by: Boris Ostrovsky
    Reviewed-by: Jan Beulich
    Acked-by: Konrad Rzeszutek Wilk
    Signed-off-by: Juergen Gross

    Juergen Gross
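
The "remember, then destroy only if created" logic from the swiotlb-xen commits above can be sketched in user space. This is a simplified model under stated assumptions: `struct buf_model` and its `dma_capable` field are hypothetical, and the boolean `made_contiguous` stands in for the PG_owner_priv_1 flag on the first page of the allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer descriptor; the kernel uses PG_owner_priv_1 on the
 * first page of the allocation as the "made contiguous" marker. */
struct buf_model {
    bool dma_capable;        /* device can already reach the memory */
    bool made_contiguous;    /* set only if the create step really ran */
};

static int create_calls, destroy_calls;

static void alloc_coherent_model(struct buf_model *b)
{
    assert(b != NULL);
    b->made_contiguous = false;
    if (!b->dma_capable) {
        create_calls++;      /* xen_create_contiguous_region() */
        b->made_contiguous = true;
    }
}

static void free_coherent_model(struct buf_model *b)
{
    assert(b != NULL);
    if (b->made_contiguous) {
        destroy_calls++;     /* xen_destroy_contiguous_region() */
        b->made_contiguous = false;
    }
}
```

With the marker in place, a buffer that was usable as allocated never reaches the destroy call, which is the saving the first commit describes for 64-bit capable devices.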
     

31 Jul, 2019

2 commits

  • Building the privcmd code as a loadable module on ARM, we get
    a link error due to the private cache management functions:

    ERROR: "__sync_icache_dcache" [drivers/xen/xen-privcmd.ko] undefined!

    Move the code into a new file that is always built in when Xen is enabled,
    as suggested by Juergen Gross and Boris Ostrovsky.

    Signed-off-by: Arnd Bergmann
    Reviewed-by: Stefano Stabellini
    Signed-off-by: Juergen Gross

    Arnd Bergmann
     
  • Commit df9bde015a72 ("xen/gntdev.c: convert to use vm_map_pages()")
    breaks the gntdev driver. If vma->vm_pgoff > 0, vm_map_pages()
    will:
    - use map->pages starting at vma->vm_pgoff instead of 0
    - verify map->count against vma_pages()+vma->vm_pgoff instead of just
    vma_pages().

    In practice, this breaks using a single gntdev FD for mapping multiple
    grants.

    relevant strace output:
    [pid 857] ioctl(7, IOCTL_GNTDEV_MAP_GRANT_REF, 0x7ffd3407b6d0) = 0
    [pid 857] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, 7, 0) =
    0x777f1211b000
    [pid 857] ioctl(7, IOCTL_GNTDEV_SET_UNMAP_NOTIFY, 0x7ffd3407b710) = 0
    [pid 857] ioctl(7, IOCTL_GNTDEV_MAP_GRANT_REF, 0x7ffd3407b6d0) = 0
    [pid 857] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, 7,
    0x1000) = -1 ENXIO (No such device or address)

    details here:
    https://github.com/QubesOS/qubes-issues/issues/5199

    The reason, quoting Marek from the discussion:

    vma->vm_pgoff is used as index passed to gntdev_find_map_index. It's
    basically using this parameter for "which grant reference to map".
    map struct returned by gntdev_find_map_index() describes just the pages
    to be mapped. Specifically map->pages[0] should be mapped at
    vma->vm_start, not vma->vm_start+vma->vm_pgoff*PAGE_SIZE.

    When trying to map a grant with index (aka vma->vm_pgoff) > 0,
    __vm_map_pages() will refuse to map it because it will expect map->count
    to be at least vma_pages(vma)+vma->vm_pgoff, while it is exactly
    vma_pages(vma).

    Replacing vm_map_pages() with vm_map_pages_zero() will fix the
    problem.

    Marek has tested and confirmed the same.

    Cc: stable@vger.kernel.org # v5.2+
    Fixes: df9bde015a72 ("xen/gntdev.c: convert to use vm_map_pages()")

    Reported-by: Marek Marczykowski-Górecki
    Signed-off-by: Souptick Joarder
    Tested-by: Marek Marczykowski-Górecki
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Juergen Gross

    Souptick Joarder
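
The range check described in this commit can be reproduced with a small model. This is a sketch under stated assumptions: the `_model` functions are hypothetical and only capture how the page offset enters the bounds check, not the actual page insertion.

```c
#include <assert.h>
#include <errno.h>

/* Model of the range check: num pages in the array, vma_pages pages in
 * the mapping, offset is the page offset into the array. */
static int map_pages_check(unsigned long num, unsigned long vma_pages,
                           unsigned long offset)
{
    assert(vma_pages > 0);   /* a mapping spans at least one page */
    if (offset >= num)
        return -ENXIO;       /* offset beyond the end of the array */
    if (vma_pages > num - offset)
        return -ENXIO;       /* mapping larger than what is left */
    return 0;
}

/* vm_map_pages(): the offset is taken from vma->vm_pgoff. */
static int vm_map_pages_model(unsigned long num, unsigned long vma_pages,
                              unsigned long vm_pgoff)
{
    return map_pages_check(num, vma_pages, vm_pgoff);
}

/* vm_map_pages_zero(): vm_pgoff is ignored, the offset is always 0. */
static int vm_map_pages_zero_model(unsigned long num, unsigned long vma_pages,
                                   unsigned long vm_pgoff)
{
    (void)vm_pgoff;
    return map_pages_check(num, vma_pages, 0);
}
```

The second `mmap()` in the strace output corresponds to one granted page, a one-page mapping and a page offset of 1: the offset-aware variant rejects it with ENXIO, while the zero-offset variant accepts it.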
     

26 Jul, 2019

1 commit

  • Fixes gcc '-Wunused-but-set-variable' warning:

    drivers/xen/xen-pciback/conf_space_capability.c: In function pm_ctrl_write:
    drivers/xen/xen-pciback/conf_space_capability.c:119:25: warning:
    variable old_state set but not used [-Wunused-but-set-variable]

    It is never used so can be removed.

    Reported-by: Hulk Robot
    Signed-off-by: YueHaibing
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Juergen Gross

    YueHaibing
     

20 Jul, 2019

3 commits

  • Pull xen updates from Juergen Gross:
    "Fixes and features:

    - A series to introduce a common command line parameter for disabling
    paravirtual extensions when running as a guest in virtualized
    environment

    - A fix for int3 handling in Xen pv guests

    - Removal of the Xen-specific tmem driver as support of tmem in Xen
    has been dropped (and it was experimental only)

    - A security fix for running as Xen dom0 (XSA-300)

    - A fix for IRQ handling when offlining cpus in Xen guests

    - Some small cleanups"

    * tag 'for-linus-5.3a-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    xen: let alloc_xenballooned_pages() fail if not enough memory free
    xen/pv: Fix a boot up hang revealed by int3 self test
    x86/xen: Add "nopv" support for HVM guest
    x86/paravirt: Remove const mark from x86_hyper_xen_hvm variable
    xen: Map "xen_nopv" parameter to "nopv" and mark it obsolete
    x86: Add "nopv" parameter to disable PV extensions
    x86/xen: Mark xen_hvm_need_lapic() and xen_x2apic_para_available() as __init
    xen: remove tmem driver
    Revert "x86/paravirt: Set up the virt_spin_lock_key after static keys get initialized"
    xen/events: fix binding user event channels to cpus

    Linus Torvalds
     
  • Pull vfs mount updates from Al Viro:
    "The first part of mount updates.

    Convert filesystems to use the new mount API"

    * 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
    mnt_init(): call shmem_init() unconditionally
    constify ksys_mount() string arguments
    don't bother with registering rootfs
    init_rootfs(): don't bother with init_ramfs_fs()
    vfs: Convert smackfs to use the new mount API
    vfs: Convert selinuxfs to use the new mount API
    vfs: Convert securityfs to use the new mount API
    vfs: Convert apparmorfs to use the new mount API
    vfs: Convert openpromfs to use the new mount API
    vfs: Convert xenfs to use the new mount API
    vfs: Convert gadgetfs to use the new mount API
    vfs: Convert oprofilefs to use the new mount API
    vfs: Convert ibmasmfs to use the new mount API
    vfs: Convert qib_fs/ipathfs to use the new mount API
    vfs: Convert efivarfs to use the new mount API
    vfs: Convert configfs to use the new mount API
    vfs: Convert binfmt_misc to use the new mount API
    convenience helper: get_tree_single()
    convenience helper get_tree_nodev()
    vfs: Kill sget_userns()
    ...

    Linus Torvalds
     
  • Merge yet more updates from Andrew Morton:
    "The rest of MM and a kernel-wide procfs cleanup.

    Summary of the more significant patches:

    - Patch series "mm/memory_hotplug: Factor out memory block
    device handling", v3. David Hildenbrand.

    Some spring-cleaning of the memory hotplug code, notably in
    drivers/base/memory.c

    - "mm: thp: fix false negative of shmem vma's THP eligibility". Yang
    Shi.

    Fix /proc/pid/smaps output for THP pages used in shmem.

    - "resource: fix locking in find_next_iomem_res()" + 1. Nadav Amit.

    Bugfix and speedup for kernel/resource.c

    - Patch series "mm: Further memory block device cleanups", David
    Hildenbrand.

    More spring-cleaning of the memory hotplug code.

    - Patch series "mm: Sub-section memory hotplug support". Dan
    Williams.

    Generalise the memory hotplug code so that pmem can use it more
    completely. Then remove the hacks from the libnvdimm code which
    were there to work around the memory-hotplug code's constraints.

    - "proc/sysctl: add shared variables for range check", Matteo Croce.

    We have about 250 instances of

    int zero;
    ...
    .extra1 = &zero,

    in the tree. This is a tree-wide sweep to make all those private
    "zero"s and "one"s use global variables.

    Alas, it isn't practical to make those two global integers const"

    * emailed patches from Andrew Morton: (38 commits)
    proc/sysctl: add shared variables for range check
    mm: migrate: remove unused mode argument
    mm/sparsemem: cleanup 'section number' data types
    libnvdimm/pfn: stop padding pmem namespaces to section alignment
    libnvdimm/pfn: fix fsdax-mode namespace info-block zero-fields
    mm/devm_memremap_pages: enable sub-section remap
    mm: document ZONE_DEVICE memory-model implications
    mm/sparsemem: support sub-section hotplug
    mm/sparsemem: prepare for sub-section ranges
    mm: kill is_dev_zone() helper
    mm/hotplug: kill is_dev_zone() usage in __remove_pages()
    mm/sparsemem: convert kmalloc_section_memmap() to populate_section_memmap()
    mm/hotplug: prepare shrink_{zone, pgdat}_span for sub-section removal
    mm/sparsemem: add helpers track active portions of a section at boot
    mm/sparsemem: introduce a SECTION_IS_EARLY flag
    mm/sparsemem: introduce struct mem_section_usage
    drivers/base/memory.c: get rid of find_memory_block_hinted()
    mm/memory_hotplug: move and simplify walk_memory_blocks()
    mm/memory_hotplug: rename walk_memory_range() and pass start+size instead of pfns
    mm: make register_mem_sect_under_node() static
    ...

    Linus Torvalds
     

19 Jul, 2019

2 commits

  • In the sysctl code the proc_dointvec_minmax() function is often used to
    validate the user supplied value between an allowed range. This
    function uses the extra1 and extra2 members from struct ctl_table as
    minimum and maximum allowed value.

    On sysctl handler declaration, in every source file there are some
    readonly variables containing just an integer whose address is assigned
    to the extra1 and extra2 members, so the sysctl range is enforced.

    The special values 0, 1 and INT_MAX are very often used as range
    boundary, leading to duplication of variables like zero=0, one=1,
    int_max=INT_MAX in different source files:

    $ git grep -E '\.extra[12].*&(zero|one|int_max)' |wc -l
    248

    Add a const int array containing the most commonly used values, some
    macros to refer more easily to the correct array member, and use them
    instead of creating a local one for every object file.

    This is the bloat-o-meter output comparing the old and new binary
    compiled with the default Fedora config:

    # scripts/bloat-o-meter -d vmlinux.o.old vmlinux.o
    add/remove: 2/2 grow/shrink: 0/2 up/down: 24/-188 (-164)
    Data old new delta
    sysctl_vals - 12 +12
    __kstrtab_sysctl_vals - 12 +12
    max 14 10 -4
    int_max 16 - -16
    one 68 - -68
    zero 128 28 -100
    Total: Before=20583249, After=20583085, chg -0.00%

    [mcroce@redhat.com: tipc: remove two unused variables]
    Link: http://lkml.kernel.org/r/20190530091952.4108-1-mcroce@redhat.com
    [akpm@linux-foundation.org: fix net/ipv6/sysctl_net_ipv6.c]
    [arnd@arndb.de: proc/sysctl: make firmware loader table conditional]
    Link: http://lkml.kernel.org/r/20190617130014.1713870-1-arnd@arndb.de
    [akpm@linux-foundation.org: fix fs/eventpoll.c]
    Link: http://lkml.kernel.org/r/20190430180111.10688-1-mcroce@redhat.com
    Signed-off-by: Matteo Croce
    Signed-off-by: Arnd Bergmann
    Acked-by: Kees Cook
    Reviewed-by: Aaron Tomlin
    Cc: Matthew Wilcox
    Cc: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matteo Croce
     
  • Pull swiotlb updates from Konrad Rzeszutek Wilk:
    "One compiler fix, and a bug-fix in swiotlb_nr_tbl() and
    swiotlb_max_segment() to check also for no_iotlb_memory"

    * 'for-linus-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
    swiotlb: fix phys_addr_t overflow warning
    swiotlb: Return consistent SWIOTLB segments/nr_tbl
    swiotlb: Group identical cleanup in swiotlb_cleanup()

    Linus Torvalds
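
The shared range values introduced by the sysctl commit above (the first of the two commits on this date) can be sketched in user space. The array and macros mirror the description in the commit message, but every `_model`/`_MODEL` name, `struct ctl_entry_model` and `set_int_minmax()` are hypothetical and much simpler than `struct ctl_table` and `proc_dointvec_minmax()`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* One shared, read-only table instead of a private zero/one/int_max
 * variable in every source file. */
static const int sysctl_vals_model[] = { 0, 1, INT_MAX };

/* The casts drop const because extra1/extra2 are plain void pointers. */
#define SYSCTL_ZERO_MODEL    ((void *)&sysctl_vals_model[0])
#define SYSCTL_ONE_MODEL     ((void *)&sysctl_vals_model[1])
#define SYSCTL_INT_MAX_MODEL ((void *)&sysctl_vals_model[2])

/* Hypothetical cut-down table entry with only the range pointers. */
struct ctl_entry_model {
    int *data;
    void *extra1;   /* minimum, or NULL for no lower bound */
    void *extra2;   /* maximum, or NULL for no upper bound */
};

/* Store val if it lies within [*extra1, *extra2]; return 0 or -1. */
static int set_int_minmax(const struct ctl_entry_model *e, int val)
{
    assert(e != NULL && e->data != NULL);
    if (e->extra1 && val < *(const int *)e->extra1)
        return -1;
    if (e->extra2 && val > *(const int *)e->extra2)
        return -1;
    *e->data = val;
    return 0;
}
```

A boolean-style knob would then be declared with `SYSCTL_ZERO_MODEL` and `SYSCTL_ONE_MODEL` as its bounds instead of pointing at file-local `zero` and `one` variables.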
     

18 Jul, 2019

1 commit

  • Instead of trying to allocate pages with GFP_USER in
    add_ballooned_pages(), check the available free memory via
    si_mem_available(). GFP_USER limits memory exhaustion far less
    than the check via si_mem_available() does.

    This will avoid dom0 running out of memory due to excessive foreign
    page mappings especially on ARM and on x86 in PVH mode, as those don't
    have a pre-ballooned area which can be used for foreign mappings.

    As the normal ballooning suffers from the same problem don't balloon
    down more than si_mem_available() pages in one iteration. At the same
    time limit the default maximum number of retries.

    This is part of XSA-300.

    Signed-off-by: Juergen Gross

    Juergen Gross
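
The two limits described in this commit can be sketched as pure functions. This is a minimal model under stated assumptions: `avail_pages` stands in for the value returned by si_mem_available(), and both function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>

/* Refuse to hand out more ballooned pages than are currently available. */
static int add_ballooned_pages_check(long nr_pages, long avail_pages)
{
    assert(nr_pages >= 0);
    return avail_pages < nr_pages ? -ENOMEM : 0;
}

/* Cap one balloon-down iteration at the memory reported as available. */
static long balloon_down_step(long wanted_pages, long avail_pages)
{
    assert(wanted_pages >= 0);
    if (avail_pages <= 0)
        return 0;
    return wanted_pages < avail_pages ? wanted_pages : avail_pages;
}
```

A larger balloon-down target is then reached over several iterations, each bounded by what the system reports as available at that moment.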