16 May, 2019

1 commit

  • Pull xen updates from Juergen Gross:

    - some minor cleanups

    - two small corrections for Xen on ARM

    - two fixes for Xen PVH guest support

    - a patch for a new command line option to tune virtual timer handling

    * tag 'for-linus-5.2b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    xen/arm: Use p2m entry with lock protection
    xen/arm: Free p2m entry if fail to add it to RB tree
    xen/pvh: correctly setup the PV EFI interface for dom0
    xen/pvh: set xen_domain_type to HVM in xen_pvh_init
    xenbus: drop useless LIST_HEAD in xenbus_write_watch() and xenbus_file_write()
    xen-netfront: mark expected switch fall-through
    xen: xen-pciback: fix warning Using plain integer as NULL pointer
    x86/xen: Add "xen_timer_slop" command line option

    Linus Torvalds
     

15 May, 2019

4 commits

  • Convert to use vm_map_pages_zero() to map range of kernel memory to user
    vma.

    This driver has ignored vm_pgoff. We could later "fix" these drivers to
    behave according to the normal vm_pgoff offsetting simply by removing the
    _zero suffix on the function name and if that causes regressions, it gives
    us an easy way to revert.

    Link: http://lkml.kernel.org/r/acf678e81d554d01a9b590716ac0ccbdcdf71c25.1552921225.git.jrdr.linux@gmail.com
    Signed-off-by: Souptick Joarder
    Reviewed-by: Boris Ostrovsky
    Cc: David Airlie
    Cc: Heiko Stuebner
    Cc: Joerg Roedel
    Cc: Joonsoo Kim
    Cc: Juergen Gross
    Cc: Kees Cook
    Cc: "Kirill A. Shutemov"
    Cc: Kyungmin Park
    Cc: Marek Szyprowski
    Cc: Matthew Wilcox
    Cc: Mauro Carvalho Chehab
    Cc: Michal Hocko
    Cc: Mike Rapoport
    Cc: Oleksandr Andrushchenko
    Cc: Pawel Osciak
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Robin Murphy
    Cc: Russell King
    Cc: Sandy Huang
    Cc: Stefan Richter
    Cc: Stephen Rothwell
    Cc: Thierry Reding
    Cc: Vlastimil Babka
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Souptick Joarder
     
  • Convert to use vm_map_pages() to map range of kernel memory to user vma.

    map->count is passed to vm_map_pages() and internal API verify map->count
    against count ( count = vma_pages(vma)) for page array boundary overrun
    condition.

    Link: http://lkml.kernel.org/r/88e56e82d2db98705c2d842e9c9806c00b366d67.1552921225.git.jrdr.linux@gmail.com
    Signed-off-by: Souptick Joarder
    Reviewed-by: Boris Ostrovsky
    Cc: David Airlie
    Cc: Heiko Stuebner
    Cc: Joerg Roedel
    Cc: Joonsoo Kim
    Cc: Juergen Gross
    Cc: Kees Cook
    Cc: "Kirill A. Shutemov"
    Cc: Kyungmin Park
    Cc: Marek Szyprowski
    Cc: Matthew Wilcox
    Cc: Mauro Carvalho Chehab
    Cc: Michal Hocko
    Cc: Mike Rapoport
    Cc: Oleksandr Andrushchenko
    Cc: Pawel Osciak
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Robin Murphy
    Cc: Russell King
    Cc: Sandy Huang
    Cc: Stefan Richter
    Cc: Stephen Rothwell
    Cc: Thierry Reding
    Cc: Vlastimil Babka
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Souptick Joarder
     
  • Use the mmu_notifier_range_blockable() helper function instead of directly
    dereferencing the range->blockable field. This is done to make it easier
    to change the mmu_notifier range field.

    This patch is the outcome of the following coccinelle patch:

    %blockable
    +mmu_notifier_range_blockable(I1)
    ...>
    }
    ------------------------------------------------------------------->%

    spatch --in-place --sp-file blockable.spatch --dir .

    Link: http://lkml.kernel.org/r/20190326164747.24405-3-jglisse@redhat.com
    Signed-off-by: Jérôme Glisse
    Reviewed-by: Ralph Campbell
    Reviewed-by: Ira Weiny
    Cc: Christian König
    Cc: Joonas Lahtinen
    Cc: Jani Nikula
    Cc: Rodrigo Vivi
    Cc: Jan Kara
    Cc: Andrea Arcangeli
    Cc: Peter Xu
    Cc: Felix Kuehling
    Cc: Jason Gunthorpe
    Cc: Ross Zwisler
    Cc: Dan Williams
    Cc: Paolo Bonzini
    Cc: Radim Krcmar
    Cc: Michal Hocko
    Cc: Christian Koenig
    Cc: John Hubbard
    Cc: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jérôme Glisse
     
  • To facilitate additional options to get_user_pages_fast() change the
    singular write parameter to be gup_flags.

    This patch does not change any functionality. New functionality will
    follow in subsequent patches.

    Some of the get_user_pages_fast() call sites were unchanged because they
    already passed FOLL_WRITE or 0 for the write parameter.

    NOTE: It was suggested to change the ordering of the get_user_pages_fast()
    arguments to ensure that callers were converted. This breaks the current
    GUP call site convention of having the returned pages be the final
    parameter. So the suggestion was rejected.

    Link: http://lkml.kernel.org/r/20190328084422.29911-4-ira.weiny@intel.com
    Link: http://lkml.kernel.org/r/20190317183438.2057-4-ira.weiny@intel.com
    Signed-off-by: Ira Weiny
    Reviewed-by: Mike Marshall
    Cc: Aneesh Kumar K.V
    Cc: Benjamin Herrenschmidt
    Cc: Borislav Petkov
    Cc: Dan Williams
    Cc: "David S. Miller"
    Cc: Heiko Carstens
    Cc: Ingo Molnar
    Cc: James Hogan
    Cc: Jason Gunthorpe
    Cc: John Hubbard
    Cc: "Kirill A. Shutemov"
    Cc: Martin Schwidefsky
    Cc: Michal Hocko
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Ralf Baechle
    Cc: Rich Felker
    Cc: Thomas Gleixner
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ira Weiny
     

08 May, 2019

3 commits

  • Pull swiotlb updates from Konrad Rzeszutek Wilk:
    "Cleanups in the swiotlb code and extra debugfs knobs to help with the
    field diagnostics"

    * 'stable/for-linus-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
    swiotlb-xen: ensure we have a single callsite for xen_dma_map_page
    swiotlb-xen: simplify the DMA sync method implementations
    swiotlb-xen: use ->map_page to implement ->map_sg
    swiotlb-xen: make instances match their method names
    swiotlb: save io_tlb_used to local variable before leaving critical section
    swiotlb: dump used and total slots when swiotlb buffer is full

    Linus Torvalds
     
  • Pull block updates from Jens Axboe:
    "Nothing major in this series, just fixes and improvements all over the
    map. This contains:

    - Series of fixes for sed-opal (David, Jonas)

    - Fixes and performance tweaks for BFQ (via Paolo)

    - Set of fixes for bcache (via Coly)

    - Set of fixes for md (via Song)

    - Enabling multi-page for passthrough requests (Ming)

    - Queue release fix series (Ming)

    - Device notification improvements (Martin)

    - Propagate underlying device rotational status in loop (Holger)

    - Removal of mtip32xx trim support, which has been disabled for years
    (Christoph)

    - Improvement and cleanup of nvme command handling (Christoph)

    - Add block SPDX tags (Christoph)

    - Cleanup/hardening of bio/bvec iteration (Christoph)

    - A few NVMe pull requests (Christoph)

    - Removal of CONFIG_LBDAF (Christoph)

    - Various little fixes here and there"

    * tag 'for-5.2/block-20190507' of git://git.kernel.dk/linux-block: (164 commits)
    block: fix mismerge in bvec_advance
    block: don't drain in-progress dispatch in blk_cleanup_queue()
    blk-mq: move cancel of hctx->run_work into blk_mq_hw_sysfs_release
    blk-mq: always free hctx after request queue is freed
    blk-mq: split blk_mq_alloc_and_init_hctx into two parts
    blk-mq: free hw queue's resource in hctx's release handler
    blk-mq: move cancel of requeue_work into blk_mq_release
    blk-mq: grab .q_usage_counter when queuing request from plug code path
    block: fix function name in comment
    nvmet: protect discovery change log event list iteration
    nvme: mark nvme_core_init and nvme_core_exit static
    nvme: move command size checks to the core
    nvme-fabrics: check more command sizes
    nvme-pci: check more command sizes
    nvme-pci: remove an unneeded variable initialization
    nvme-pci: unquiesce admin queue on shutdown
    nvme-pci: shutdown on timeout during deletion
    nvme-pci: fix psdt field for single segment sgls
    nvme-multipath: don't print ANA group state by default
    nvme-multipath: split bios with the ns_head bio_set before submitting
    ...

    Linus Torvalds
     
  • Pull stream_open conversion from Kirill Smelkov:

    - remove unnecessary double nonseekable_open from drivers/char/dtlk.c
    as noticed by Pavel Machek while reviewing nonseekable_open ->
    stream_open mass conversion.

    - the mass conversion patch promised in commit 10dce8af3422 ("fs:
    stream_open - opener for stream-like files so that read and write can
    run simultaneously without deadlock") and is automatically generated
    by running

    $ make coccicheck MODE=patch COCCI=scripts/coccinelle/api/stream_open.cocci

    I've verified each generated change manually - that it is correct to
    convert - and each other nonseekable_open instance left - that it is
    either not correct to convert there, or that it is not converted due
    to current stream_open.cocci limitations. More details on this in the
    patch.

    - finally, change VFS to pass ppos=NULL into .read/.write for files
    that declare themselves streams. It was suggested by Rasmus Villemoes
    and makes sure that if ppos starts to be erroneously used in a stream
    file, such bug won't go unnoticed and will produce an oops instead of
    creating illusion of position change being taken into account.

    Note: this patch does not conflict with "fuse: Add FOPEN_STREAM to
    use stream_open()" that will be hopefully coming via FUSE tree,
    because fs/fuse/ uses new-style .read_iter/.write_iter, and for these
    accessors position is still passed as non-pointer kiocb.ki_pos .

    * tag 'stream_open-5.2' of https://lab.nexedi.com/kirr/linux:
    vfs: pass ppos=NULL to .read()/.write() of FMODE_STREAM files
    *: convert stream-like files from nonseekable_open -> stream_open
    dtlk: remove double call to nonseekable_open

    Linus Torvalds
     

06 May, 2019

1 commit

  • Using scripts/coccinelle/api/stream_open.cocci added in 10dce8af3422
    ("fs: stream_open - opener for stream-like files so that read and write
    can run simultaneously without deadlock"), search and convert to
    stream_open all in-kernel nonseekable_open users for which read and
    write actually do not depend on ppos and where there is no other methods
    in file_operations which assume @offset access.

    I've verified each generated change manually - that it is correct to convert -
    and each other nonseekable_open instance left - that it is either not correct
    to convert there, or that it is not converted due to current stream_open.cocci
    limitations. The script also does not convert files that should be valid to
    convert, but that currently have .llseek = noop_llseek or generic_file_llseek
    for unknown reason despite file being opened with nonseekable_open (e.g.
    drivers/input/mousedev.c)

    Among cases converted 14 were potentially vulnerable to read vs write deadlock
    (see details in 10dce8af3422):

    drivers/char/pcmcia/cm4000_cs.c:1685:7-23: ERROR: cm4000_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix.
    drivers/gnss/core.c:45:1-17: ERROR: gnss_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix.
    drivers/hid/uhid.c:635:1-17: ERROR: uhid_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix.
    drivers/infiniband/core/user_mad.c:988:1-17: ERROR: umad_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix.
    drivers/input/evdev.c:527:1-17: ERROR: evdev_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix.
    drivers/input/misc/uinput.c:401:1-17: ERROR: uinput_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix.
    drivers/isdn/capi/capi.c:963:8-24: ERROR: capi_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix.
    drivers/leds/uleds.c:77:1-17: ERROR: uleds_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix.
    drivers/media/rc/lirc_dev.c:198:1-17: ERROR: lirc_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix.
    drivers/s390/char/fs3270.c:488:1-17: ERROR: fs3270_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix.
    drivers/usb/misc/ldusb.c:310:1-17: ERROR: ld_usb_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix.
    drivers/xen/evtchn.c:667:8-24: ERROR: evtchn_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix.
    net/batman-adv/icmp_socket.c:80:1-17: ERROR: batadv_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix.
    net/rfkill/core.c:1146:8-24: ERROR: rfkill_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix.

    and the rest were just safe to convert to stream_open because their read and
    write do not use ppos at all and corresponding file_operations do not
    have methods that assume @offset file access(*):

    arch/powerpc/platforms/52xx/mpc52xx_gpt.c:631:8-24: WARNING: mpc52xx_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    arch/powerpc/platforms/cell/spufs/file.c:591:8-24: WARNING: spufs_ibox_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open.
    arch/powerpc/platforms/cell/spufs/file.c:591:8-24: WARNING: spufs_ibox_stat_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open.
    arch/powerpc/platforms/cell/spufs/file.c:591:8-24: WARNING: spufs_mbox_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open.
    arch/powerpc/platforms/cell/spufs/file.c:591:8-24: WARNING: spufs_mbox_stat_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open.
    arch/powerpc/platforms/cell/spufs/file.c:591:8-24: WARNING: spufs_wbox_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    arch/powerpc/platforms/cell/spufs/file.c:591:8-24: WARNING: spufs_wbox_stat_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open.
    arch/um/drivers/harddog_kern.c:88:8-24: WARNING: harddog_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    arch/x86/kernel/cpu/microcode/core.c:430:33-49: WARNING: microcode_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/char/ds1620.c:215:8-24: WARNING: ds1620_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/char/dtlk.c:301:1-17: WARNING: dtlk_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/char/ipmi/ipmi_watchdog.c:840:9-25: WARNING: ipmi_wdog_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/char/pcmcia/scr24x_cs.c:95:8-24: WARNING: scr24x_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/char/tb0219.c:246:9-25: WARNING: tb0219_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/firewire/nosy.c:306:8-24: WARNING: nosy_ops: .read() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/hwmon/fschmd.c:840:8-24: WARNING: watchdog_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/hwmon/w83793.c:1344:8-24: WARNING: watchdog_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/infiniband/core/ucma.c:1747:8-24: WARNING: ucma_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/infiniband/core/ucm.c:1178:8-24: WARNING: ucm_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/infiniband/core/uverbs_main.c:1086:8-24: WARNING: uverbs_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/input/joydev.c:282:1-17: WARNING: joydev_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/pci/switch/switchtec.c:393:1-17: WARNING: switchtec_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/platform/chrome/cros_ec_debugfs.c:135:8-24: WARNING: cros_ec_console_log_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/rtc/rtc-ds1374.c:470:9-25: WARNING: ds1374_wdt_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/rtc/rtc-m41t80.c:805:9-25: WARNING: wdt_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/s390/char/tape_char.c:293:2-18: WARNING: tape_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/s390/char/zcore.c:194:8-24: WARNING: zcore_reipl_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/s390/crypto/zcrypt_api.c:528:8-24: WARNING: zcrypt_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/spi/spidev.c:594:1-17: WARNING: spidev_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/staging/pi433/pi433_if.c:974:1-17: WARNING: pi433_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/acquirewdt.c:203:8-24: WARNING: acq_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/advantechwdt.c:202:8-24: WARNING: advwdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/alim1535_wdt.c:252:8-24: WARNING: ali_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/alim7101_wdt.c:217:8-24: WARNING: wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/ar7_wdt.c:166:8-24: WARNING: ar7_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/at91rm9200_wdt.c:113:8-24: WARNING: at91wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/ath79_wdt.c:135:8-24: WARNING: ath79_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/bcm63xx_wdt.c:119:8-24: WARNING: bcm63xx_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/cpu5wdt.c:143:8-24: WARNING: cpu5wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/cpwd.c:397:8-24: WARNING: cpwd_fops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/eurotechwdt.c:319:8-24: WARNING: eurwdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/f71808e_wdt.c:528:8-24: WARNING: watchdog_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/gef_wdt.c:232:8-24: WARNING: gef_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/geodewdt.c:95:8-24: WARNING: geodewdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/ib700wdt.c:241:8-24: WARNING: ibwdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/ibmasr.c:326:8-24: WARNING: asr_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/indydog.c:80:8-24: WARNING: indydog_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/intel_scu_watchdog.c:307:8-24: WARNING: intel_scu_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/iop_wdt.c:104:8-24: WARNING: iop_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/it8712f_wdt.c:330:8-24: WARNING: it8712f_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/ixp4xx_wdt.c:68:8-24: WARNING: ixp4xx_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/ks8695_wdt.c:145:8-24: WARNING: ks8695wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/m54xx_wdt.c:88:8-24: WARNING: m54xx_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/machzwd.c:336:8-24: WARNING: zf_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/mixcomwd.c:153:8-24: WARNING: mixcomwd_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/mtx-1_wdt.c:121:8-24: WARNING: mtx1_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/mv64x60_wdt.c:136:8-24: WARNING: mv64x60_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/nuc900_wdt.c:134:8-24: WARNING: nuc900wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/nv_tco.c:164:8-24: WARNING: nv_tco_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/pc87413_wdt.c:289:8-24: WARNING: pc87413_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/pcwd.c:698:8-24: WARNING: pcwd_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/pcwd.c:737:8-24: WARNING: pcwd_temp_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/pcwd_pci.c:581:8-24: WARNING: pcipcwd_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/pcwd_pci.c:623:8-24: WARNING: pcipcwd_temp_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/pcwd_usb.c:488:8-24: WARNING: usb_pcwd_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/pcwd_usb.c:527:8-24: WARNING: usb_pcwd_temperature_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/pika_wdt.c:121:8-24: WARNING: pikawdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/pnx833x_wdt.c:119:8-24: WARNING: pnx833x_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/rc32434_wdt.c:153:8-24: WARNING: rc32434_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/rdc321x_wdt.c:145:8-24: WARNING: rdc321x_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/riowd.c:79:1-17: WARNING: riowd_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/sa1100_wdt.c:62:8-24: WARNING: sa1100dog_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/sbc60xxwdt.c:211:8-24: WARNING: wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/sbc7240_wdt.c:139:8-24: WARNING: wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/sbc8360.c:274:8-24: WARNING: sbc8360_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/sbc_epx_c3.c:81:8-24: WARNING: epx_c3_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/sbc_fitpc2_wdt.c:78:8-24: WARNING: fitpc2_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/sb_wdog.c:108:1-17: WARNING: sbwdog_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/sc1200wdt.c:181:8-24: WARNING: sc1200wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/sc520_wdt.c:261:8-24: WARNING: wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/sch311x_wdt.c:319:8-24: WARNING: sch311x_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/scx200_wdt.c:105:8-24: WARNING: scx200_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/smsc37b787_wdt.c:369:8-24: WARNING: wb_smsc_wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/w83877f_wdt.c:227:8-24: WARNING: wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/w83977f_wdt.c:301:8-24: WARNING: wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/wafer5823wdt.c:200:8-24: WARNING: wafwdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/watchdog_dev.c:828:8-24: WARNING: watchdog_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/wdrtas.c:379:8-24: WARNING: wdrtas_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/wdrtas.c:445:8-24: WARNING: wdrtas_temp_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/wdt285.c:104:1-17: WARNING: watchdog_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/wdt977.c:276:8-24: WARNING: wdt977_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/wdt.c:424:8-24: WARNING: wdt_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/wdt.c:484:8-24: WARNING: wdt_temp_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/wdt_pci.c:464:8-24: WARNING: wdtpci_fops: .write() has stream semantic; safe to change nonseekable_open -> stream_open.
    drivers/watchdog/wdt_pci.c:527:8-24: WARNING: wdtpci_temp_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open.
    net/batman-adv/log.c:105:1-17: WARNING: batadv_log_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open.
    sound/core/control.c:57:7-23: WARNING: snd_ctl_f_ops: .read() has stream semantic; safe to change nonseekable_open -> stream_open.
    sound/core/rawmidi.c:385:7-23: WARNING: snd_rawmidi_f_ops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open.
    sound/core/seq/seq_clientmgr.c:310:7-23: WARNING: snd_seq_f_ops: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open.
    sound/core/timer.c:1428:7-23: WARNING: snd_timer_f_ops: .read() has stream semantic; safe to change nonseekable_open -> stream_open.

    One can also recheck/review the patch via generating it with explanation comments included via

    $ make coccicheck MODE=patch COCCI=scripts/coccinelle/api/stream_open.cocci SPFLAGS="-D explain"

    (*) This second group also contains cases with read/write deadlocks that
    stream_open.cocci don't yet detect, but which are still valid to convert to
    stream_open since ppos is not used. For example drivers/pci/switch/switchtec.c
    calls wait_for_completion_interruptible() in its .read, but stream_open.cocci
    currently detects only "wait_event*" as blocking.

    Cc: Michael Kerrisk
    Cc: Yongzhi Pan
    Cc: Jonathan Corbet
    Cc: David Vrabel
    Cc: Juergen Gross
    Cc: Miklos Szeredi
    Cc: Tejun Heo
    Cc: Kirill Tkhai
    Cc: Arnd Bergmann
    Cc: Christoph Hellwig
    Cc: Greg Kroah-Hartman
    Cc: Julia Lawall
    Cc: Nikolaus Rath
    Cc: Han-Wen Nienhuys
    Cc: Anatolij Gustschin
    Cc: Jeff Dike
    Cc: Richard Weinberger
    Cc: Anton Ivanov
    Cc: Borislav Petkov
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "James R. Van Zandt"
    Cc: Corey Minyard
    Cc: Harald Welte
    Acked-by: Lubomir Rintel [scr24x_cs]
    Cc: Stefan Richter
    Cc: Johan Hovold
    Cc: David Herrmann
    Cc: Jiri Kosina
    Cc: Benjamin Tissoires
    Cc: Jean Delvare
    Acked-by: Guenter Roeck [watchdog/* hwmon/*]
    Cc: Rudolf Marek
    Cc: Dmitry Torokhov
    Cc: Karsten Keil
    Cc: Jacek Anaszewski
    Cc: Pavel Machek
    Cc: Mauro Carvalho Chehab
    Cc: Kurt Schwemmer
    Acked-by: Logan Gunthorpe [drivers/pci/switch/switchtec]
    Acked-by: Bjorn Helgaas [drivers/pci/switch/switchtec]
    Cc: Benson Leung
    Acked-by: Enric Balletbo i Serra [platform/chrome]
    Cc: Alessandro Zummo
    Acked-by: Alexandre Belloni [rtc/*]
    Cc: Mark Brown
    Cc: Wim Van Sebroeck
    Cc: Florian Fainelli
    Cc: bcm-kernel-feedback-list@broadcom.com
    Cc: Wan ZongShun
    Cc: Zwane Mwaikambo
    Cc: Marek Lindner
    Cc: Simon Wunderlich
    Cc: Antonio Quartulli
    Cc: "David S. Miller"
    Cc: Johannes Berg
    Cc: Jaroslav Kysela
    Cc: Takashi Iwai
    Signed-off-by: Kirill Smelkov

    Kirill Smelkov
     

03 May, 2019

4 commits


25 Apr, 2019

1 commit


23 Apr, 2019

1 commit


22 Apr, 2019

1 commit

  • Pull in v5.1-rc6 to resolve two conflicts. One is in BFQ, in just a
    comment, and is trivial. The other one is a conflict due to a later fix
    in the bio multi-page work, and needs a bit more care.

    * tag 'v5.1-rc6': (770 commits)
    Linux 5.1-rc6
    block: make sure that bvec length can't be overflow
    block: kill all_q_node in request_queue
    x86/cpu/intel: Lower the "ENERGY_PERF_BIAS: Set to normal" message's log priority
    coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping
    mm/kmemleak.c: fix unused-function warning
    init: initialize jump labels before command line option parsing
    kernel/watchdog_hld.c: hard lockup message should end with a newline
    kcov: improve CONFIG_ARCH_HAS_KCOV help text
    mm: fix inactive list balancing between NUMA nodes and cgroups
    mm/hotplug: treat CMA pages as unmovable
    proc: fixup proc-pid-vm test
    proc: fix map_files test on F29
    mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n
    mm/memory_hotplug: do not unlock after failing to take the device_hotplug_lock
    mm: swapoff: shmem_unuse() stop eviction without igrab()
    mm: swapoff: take notice of completion sooner
    mm: swapoff: remove too limiting SWAP_UNUSE_MAX_TRIES
    mm: swapoff: shmem_find_swap_entries() filter out other types
    slab: store tagged freelist for off-slab slabmgmt
    ...

    Signed-off-by: Jens Axboe

    Jens Axboe
     

17 Apr, 2019

1 commit

  • irq_ctx_init() is invoked from native_init_IRQ() or from xen_init_IRQ()
    code. There is no reason to have this split. The interrupt stacks must be
    allocated no matter what.

    Invoke it from init_IRQ() before invoking the native or XEN init
    implementation.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Borislav Petkov
    Reviewed-by: Juergen Gross
    Cc: Andy Lutomirski
    Cc: Boris Ostrovsky
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: Josh Abraham
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Mike Rapoport
    Cc: Nicolai Stange
    Cc: Sean Christopherson
    Cc: Stefano Stabellini
    Cc: Stephen Rothwell
    Cc: x86-ml
    Cc: xen-devel@lists.xenproject.org
    Link: https://lkml.kernel.org/r/20190414160146.001162606@linutronix.de

    Thomas Gleixner
     

08 Apr, 2019

1 commit


07 Apr, 2019

1 commit

  • …multaneously without deadlock

    Commit 9c225f2655e3 ("vfs: atomic f_pos accesses as per POSIX") added
    locking for file.f_pos access and in particular made concurrent read and
    write not possible - now both those functions take f_pos lock for the
    whole run, and so if e.g. a read is blocked waiting for data, write will
    deadlock waiting for that read to complete.

    This caused regression for stream-like files where previously read and
    write could run simultaneously, but after that patch could not do so
    anymore. See e.g. commit 581d21a2d02a ("xenbus: fix deadlock on writes
    to /proc/xen/xenbus") which fixes such regression for particular case of
    /proc/xen/xenbus.

    The patch that added f_pos lock in 2014 did so to guarantee POSIX thread
    safety for read/write/lseek and added the locking to file descriptors of
    all regular files. In 2014 that thread-safety problem was not new as it
    was already discussed earlier in 2006.

    However even though 2006'th version of Linus's patch was adding f_pos
    locking "only for files that are marked seekable with FMODE_LSEEK (thus
    avoiding the stream-like objects like pipes and sockets)", the 2014
    version - the one that actually made it into the tree as 9c225f2655e3 -
    is doing so irregardless of whether a file is seekable or not.

    See

    https://lore.kernel.org/lkml/53022DB1.4070805@gmail.com/
    https://lwn.net/Articles/180387
    https://lwn.net/Articles/180396

    for historic context.

    The reason that it did so is, probably, that there are many files that
    are marked non-seekable, but e.g. their read implementation actually
    depends on knowing current position to correctly handle the read. Some
    examples:

    kernel/power/user.c snapshot_read
    fs/debugfs/file.c u32_array_read
    fs/fuse/control.c fuse_conn_waiting_read + ...
    drivers/hwmon/asus_atk0110.c atk_debugfs_ggrp_read
    arch/s390/hypfs/inode.c hypfs_read_iter
    ...

    Despite that, many nonseekable_open users implement read and write with
    pure stream semantics - they don't depend on passed ppos at all. And for
    those cases where read could wait for something inside, it creates a
    situation similar to xenbus - the write could be never made to go until
    read is done, and read is waiting for some, potentially external, event,
    for potentially unbounded time -> deadlock.

    Besides xenbus, there are 14 such places in the kernel that I've found
    with semantic patch (see below):

    drivers/xen/evtchn.c:667:8-24: ERROR: evtchn_fops: .read() can deadlock .write()
    drivers/isdn/capi/capi.c:963:8-24: ERROR: capi_fops: .read() can deadlock .write()
    drivers/input/evdev.c:527:1-17: ERROR: evdev_fops: .read() can deadlock .write()
    drivers/char/pcmcia/cm4000_cs.c:1685:7-23: ERROR: cm4000_fops: .read() can deadlock .write()
    net/rfkill/core.c:1146:8-24: ERROR: rfkill_fops: .read() can deadlock .write()
    drivers/s390/char/fs3270.c:488:1-17: ERROR: fs3270_fops: .read() can deadlock .write()
    drivers/usb/misc/ldusb.c:310:1-17: ERROR: ld_usb_fops: .read() can deadlock .write()
    drivers/hid/uhid.c:635:1-17: ERROR: uhid_fops: .read() can deadlock .write()
    net/batman-adv/icmp_socket.c:80:1-17: ERROR: batadv_fops: .read() can deadlock .write()
    drivers/media/rc/lirc_dev.c:198:1-17: ERROR: lirc_fops: .read() can deadlock .write()
    drivers/leds/uleds.c:77:1-17: ERROR: uleds_fops: .read() can deadlock .write()
    drivers/input/misc/uinput.c:400:1-17: ERROR: uinput_fops: .read() can deadlock .write()
    drivers/infiniband/core/user_mad.c:985:7-23: ERROR: umad_fops: .read() can deadlock .write()
    drivers/gnss/core.c:45:1-17: ERROR: gnss_fops: .read() can deadlock .write()

    In addition to the cases above another regression caused by f_pos
    locking is that now FUSE filesystems that implement open with
    FOPEN_NONSEEKABLE flag, can no longer implement bidirectional
    stream-like files - for the same reason as above e.g. read can deadlock
    write locking on file.f_pos in the kernel.

    FUSE's FOPEN_NONSEEKABLE was added in 2008 in a7c1b990f715 ("fuse:
    implement nonseekable open") to support OSSPD. OSSPD implements /dev/dsp
    in userspace with FOPEN_NONSEEKABLE flag, with corresponding read and
    write routines not depending on current position at all, and with both
    read and write being potentially blocking operations:

    See

    https://github.com/libfuse/osspd
    https://lwn.net/Articles/308445

    https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1406
    https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1438-L1477
    https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1479-L1510

    Corresponding libfuse example/test also describes FOPEN_NONSEEKABLE as
    "somewhat pipe-like files ..." with read handler not using offset.
    However that test implements only read without write and cannot exercise
    the deadlock scenario:

    https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L124-L131
    https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L146-L163
    https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L209-L216

    I've actually hit the read vs write deadlock for real while implementing
    my FUSE filesystem where there is /head/watch file, for which open
    creates separate bidirectional socket-like stream in between filesystem
    and its user with both read and write being later performed
    simultaneously. And there it is semantically not easy to split the
    stream into two separate read-only and write-only channels:

    https://lab.nexedi.com/kirr/wendelin.core/blob/f13aa600/wcfs/wcfs.go#L88-169

    Let's fix this regression. The plan is:

    1. We can't change nonseekable_open to include &~FMODE_ATOMIC_POS -
    doing so would break many in-kernel nonseekable_open users which
    actually use ppos in read/write handlers.

    2. Add stream_open() to kernel to open stream-like non-seekable file
    descriptors. Read and write on such file descriptors would never use
    nor change ppos. And with that property on stream-like files read and
    write will be running without taking f_pos lock - i.e. read and write
    could be running simultaneously.

    3. With semantic patch search and convert to stream_open all in-kernel
    nonseekable_open users for which read and write actually do not
    depend on ppos and where there is no other methods in file_operations
    which assume @offset access.

    4. Add FOPEN_STREAM to fs/fuse/ and open in-kernel file-descriptors via
    steam_open if that bit is present in filesystem open reply.

    It was tempting to change fs/fuse/ open handler to use stream_open
    instead of nonseekable_open on just FOPEN_NONSEEKABLE flags, but
    grepping through Debian codesearch shows users of FOPEN_NONSEEKABLE,
    and in particular GVFS which actually uses offset in its read and
    write handlers

    https://codesearch.debian.net/search?q=-%3Enonseekable+%3D
    https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1080
    https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1247-1346
    https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1399-1481

    so if we would do such a change it will break a real user.

    5. Add stream_open and FOPEN_STREAM handling to stable kernels starting
    from v3.14+ (the kernel where 9c225f2655 first appeared).

    This will allow to patch OSSPD and other FUSE filesystems that
    provide stream-like files to return FOPEN_STREAM | FOPEN_NONSEEKABLE
    in their open handler and this way avoid the deadlock on all kernel
    versions. This should work because fs/fuse/ ignores unknown open
    flags returned from a filesystem and so passing FOPEN_STREAM to a
    kernel that is not aware of this flag cannot hurt. In turn the kernel
    that is not aware of FOPEN_STREAM will be < v3.14 where just
    FOPEN_NONSEEKABLE is sufficient to implement streams without read vs
    write deadlock.

    This patch adds stream_open, converts /proc/xen/xenbus to it and adds
    semantic patch to automatically locate in-kernel places that are either
    required to be converted due to read vs write deadlock, or that are just
    safe to be converted because read and write do not use ppos and there
    are no other funky methods in file_operations.

    Regarding semantic patch I've verified each generated change manually -
    that it is correct to convert - and each other nonseekable_open instance
    left - that it is either not correct to convert there, or that it is not
    converted due to current stream_open.cocci limitations.

    The script also does not convert files that should be valid to convert,
    but that currently have .llseek = noop_llseek or generic_file_llseek for
    unknown reason despite file being opened with nonseekable_open (e.g.
    drivers/input/mousedev.c)

    Cc: Michael Kerrisk <mtk.manpages@gmail.com>
    Cc: Yongzhi Pan <panyongzhi@gmail.com>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: David Vrabel <david.vrabel@citrix.com>
    Cc: Juergen Gross <jgross@suse.com>
    Cc: Miklos Szeredi <miklos@szeredi.hu>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
    Cc: Arnd Bergmann <arnd@arndb.de>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: Julia Lawall <Julia.Lawall@lip6.fr>
    Cc: Nikolaus Rath <Nikolaus@rath.org>
    Cc: Han-Wen Nienhuys <hanwen@google.com>
    Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

    Kirill Smelkov
     

05 Apr, 2019

1 commit

  • struct privcmd_buf_vma_private has a zero-sized array at the end
    (pages), use the new struct_size() helper to determine the proper
    allocation size and avoid potential type mistakes.

    Signed-off-by: Andrea Righi
    Reviewed-by: Juergen Gross
    Signed-off-by: Juergen Gross

    Andrea Righi
     

02 Apr, 2019

1 commit

  • xen_biovec_phys_mergeable() only needs .bv_page of the 2nd bio bvec
    for checking if the two bvecs can be merged, so pass page to
    xen_biovec_phys_mergeable() directly.

    No function change.

    Cc: ris Ostrovsky
    Cc: Juergen Gross
    Cc: xen-devel@lists.xenproject.org
    Cc: Omar Sandoval
    Cc: Christoph Hellwig
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     

15 Mar, 2019

1 commit

  • The XEN balloon driver - in contrast to other balloon drivers - allows
    to map some inflated pages to user space. Such pages are allocated via
    alloc_xenballooned_pages() and freed via free_xenballooned_pages().
    The pfn space of these allocated pages is used to map other things
    by the hypervisor using hypercalls.

    Pages marked with PG_offline must never be mapped to user space (as
    this page type uses the mapcount field of struct pages).

    So what we can do is, clear/set PG_offline when allocating/freeing an
    inflated pages. This way, most inflated pages can be excluded by
    dumping tools and the "reused for other purpose" balloon pages are
    correctly not marked as PG_offline.

    Fixes: 77c4adf6a6df (xen/balloon: mark inflated pages PG_offline)
    Reported-by: Julien Grall
    Tested-by: Julien Grall
    Signed-off-by: David Hildenbrand
    Reviewed-by: Juergen Gross
    Signed-off-by: Juergen Gross

    David Hildenbrand
     

13 Mar, 2019

2 commits

  • Merge misc updates from Andrew Morton:

    - a few misc things

    - the rest of MM

    - remove flex_arrays, replace with new simple radix-tree implementation

    * emailed patches from Andrew Morton : (38 commits)
    Drop flex_arrays
    sctp: convert to genradix
    proc: commit to genradix
    generic radix trees
    selinux: convert to kvmalloc
    md: convert to kvmalloc
    openvswitch: convert to kvmalloc
    of: fix kmemleak crash caused by imbalance in early memory reservation
    mm: memblock: update comments and kernel-doc
    memblock: split checks whether a region should be skipped to a helper function
    memblock: remove memblock_{set,clear}_region_flags
    memblock: drop memblock_alloc_*_nopanic() variants
    memblock: memblock_alloc_try_nid: don't panic
    treewide: add checks for the return value of memblock_alloc*()
    swiotlb: add checks for the return value of memblock_alloc*()
    init/main: add checks for the return value of memblock_alloc*()
    mm/percpu: add checks for the return value of memblock_alloc*()
    sparc: add checks for the return value of memblock_alloc*()
    ia64: add checks for the return value of memblock_alloc*()
    arch: don't memset(0) memory returned by memblock_alloc()
    ...

    Linus Torvalds
     
  • Add check for the return value of memblock_alloc*() functions and call
    panic() in case of error. The panic message repeats the one used by
    panicing memblock allocators with adjustment of parameters to include
    only relevant ones.

    The replacement was mostly automated with semantic patches like the one
    below with manual massaging of format strings.

    @@
    expression ptr, size, align;
    @@
    ptr = memblock_alloc(size, align);
    + if (!ptr)
    + panic("%s: Failed to allocate %lu bytes align=0x%lx\n", __func__, size, align);

    [anders.roxell@linaro.org: use '%pa' with 'phys_addr_t' type]
    Link: http://lkml.kernel.org/r/20190131161046.21886-1-anders.roxell@linaro.org
    [rppt@linux.ibm.com: fix format strings for panics after memblock_alloc]
    Link: http://lkml.kernel.org/r/1548950940-15145-1-git-send-email-rppt@linux.ibm.com
    [rppt@linux.ibm.com: don't panic if the allocation in sparse_buffer_init fails]
    Link: http://lkml.kernel.org/r/20190131074018.GD28876@rapoport-lnx
    [akpm@linux-foundation.org: fix xtensa printk warning]
    Link: http://lkml.kernel.org/r/1548057848-15136-20-git-send-email-rppt@linux.ibm.com
    Signed-off-by: Mike Rapoport
    Signed-off-by: Anders Roxell
    Reviewed-by: Guo Ren [c-sky]
    Acked-by: Paul Burton [MIPS]
    Acked-by: Heiko Carstens [s390]
    Reviewed-by: Juergen Gross [Xen]
    Reviewed-by: Geert Uytterhoeven [m68k]
    Acked-by: Max Filippov [xtensa]
    Cc: Catalin Marinas
    Cc: Christophe Leroy
    Cc: Christoph Hellwig
    Cc: "David S. Miller"
    Cc: Dennis Zhou
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Guo Ren
    Cc: Mark Salter
    Cc: Matt Turner
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Petr Mladek
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Rob Herring
    Cc: Rob Herring
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

12 Mar, 2019

1 commit

  • Pull xen updates from Juergen Gross:
    "xen fixes and features:

    - remove fallback code for very old Xen hypervisors

    - three patches for fixing Xen dom0 boot regressions

    - an old patch for Xen PCI passthrough which was never applied for
    unknown reasons

    - some more minor fixes and cleanup patches"

    * tag 'for-linus-5.1a-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    xen: fix dom0 boot on huge systems
    xen, cpu_hotplug: Prevent an out of bounds access
    xen: remove pre-xen3 fallback handlers
    xen/ACPI: Switch to bitmap_zalloc()
    x86/xen: dont add memory above max allowed allocation
    x86: respect memory size limiting via mem= parameter
    xen/gntdev: Check and release imported dma-bufs on close
    xen/gntdev: Do not destroy context while dma-bufs are in use
    xen/pciback: Don't disable PCI_COMMAND on PCI device reset.
    xen-scsiback: mark expected switch fall-through
    xen: mark expected switch fall-through

    Linus Torvalds
     

10 Mar, 2019

1 commit

  • Pull SCSI updates from James Bottomley:
    "This is mostly update of the usual drivers: arcmsr, qla2xxx, lpfc,
    hisi_sas, target/iscsi and target/core.

    Additionally Christoph refactored gdth as part of the dma changes. The
    major mid-layer change this time is the removal of bidi commands and
    with them the whole of the osd/exofs driver and filesystem. This is a
    major simplification for block and mq in particular"

    * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (240 commits)
    scsi: cxgb4i: validate tcp sequence number only if chip version pf
    scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.c
    scsi: mpt3sas: Add missing breaks in switch statements
    scsi: aacraid: Fix missing break in switch statement
    scsi: kill command serial number
    scsi: csiostor: drop serial_number usage
    scsi: mvumi: use request tag instead of serial_number
    scsi: dpt_i2o: remove serial number usage
    scsi: st: osst: Remove negative constant left-shifts
    scsi: ufs-bsg: Allow reading descriptors
    scsi: ufs: Allow reading descriptor via raw upiu
    scsi: ufs-bsg: Change the calling convention for write descriptor
    scsi: ufs: Remove unused device quirks
    Revert "scsi: ufs: disable vccq if it's not needed by UFS device"
    scsi: megaraid_sas: Remove a bunch of set but not used variables
    scsi: clean obsolete return values of eh_timed_out
    scsi: sd: Optimal I/O size should be a multiple of physical block size
    scsi: MAINTAINERS: SCSI initiator and target tweaks
    scsi: fcoe: make use of fip_mode enum complete
    ...

    Linus Torvalds
     

09 Mar, 2019

1 commit

  • The "cpu" variable comes from the sscanf() so Smatch marks it as
    untrusted data. We can't pass a higher value than "nr_cpu_ids" to
    cpu_possible() or it results in an out of bounds access.

    Fixes: d68d82afd4c8 ("xen: implement CPU hotplugging")
    Signed-off-by: Dan Carpenter
    Reviewed-by: Juergen Gross
    Signed-off-by: Juergen Gross

    Dan Carpenter
     

06 Mar, 2019

2 commits

  • Mark inflated and never onlined pages PG_offline, to tell the world that
    the content is stale and should not be dumped.

    Link: http://lkml.kernel.org/r/20181119101616.8901-5-david@redhat.com
    Signed-off-by: David Hildenbrand
    Reviewed-by: Juergen Gross
    Cc: Boris Ostrovsky
    Cc: Stefano Stabellini
    Cc: Matthew Wilcox
    Cc: Michal Hocko
    Cc: "Michael S. Tsirkin"
    Cc: Alexander Duyck
    Cc: Alexey Dobriyan
    Cc: Arnd Bergmann
    Cc: Baoquan He
    Cc: Borislav Petkov
    Cc: Christian Hansen
    Cc: Dave Young
    Cc: David Rientjes
    Cc: Greg Kroah-Hartman
    Cc: Haiyang Zhang
    Cc: Jonathan Corbet
    Cc: Julien Freche
    Cc: Kairui Song
    Cc: Kazuhito Hagio
    Cc: "Kirill A. Shutemov"
    Cc: Konstantin Khlebnikov
    Cc: "K. Y. Srinivasan"
    Cc: Len Brown
    Cc: Lianbo Jiang
    Cc: Michal Hocko
    Cc: Mike Rapoport
    Cc: Miles Chen
    Cc: Nadav Amit
    Cc: Naoya Horiguchi
    Cc: Omar Sandoval
    Cc: Pankaj gupta
    Cc: Pavel Machek
    Cc: Pavel Tatashin
    Cc: Rafael J. Wysocki
    Cc: "Rafael J. Wysocki"
    Cc: Stephen Hemminger
    Cc: Stephen Rothwell
    Cc: Vitaly Kuznetsov
    Cc: Vlastimil Babka
    Cc: Xavier Deguillard
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Hildenbrand
     
  • When freeing pages are done with higher order, time spent on coalescing
    pages by buddy allocator can be reduced. With section size of 256MB,
    hot add latency of a single section shows improvement from 50-60 ms to
    less than 1 ms, hence improving the hot add latency by 60 times. Modify
    external providers of online callback to align with the change.

    [arunks@codeaurora.org: v11]
    Link: http://lkml.kernel.org/r/1547792588-18032-1-git-send-email-arunks@codeaurora.org
    [akpm@linux-foundation.org: remove unused local, per Arun]
    [akpm@linux-foundation.org: avoid return of void-returning __free_pages_core(), per Oscar]
    [akpm@linux-foundation.org: fix it for mm-convert-totalram_pages-and-totalhigh_pages-variables-to-atomic.patch]
    [arunks@codeaurora.org: v8]
    Link: http://lkml.kernel.org/r/1547032395-24582-1-git-send-email-arunks@codeaurora.org
    [arunks@codeaurora.org: v9]
    Link: http://lkml.kernel.org/r/1547098543-26452-1-git-send-email-arunks@codeaurora.org
    Link: http://lkml.kernel.org/r/1538727006-5727-1-git-send-email-arunks@codeaurora.org
    Signed-off-by: Arun KS
    Reviewed-by: Andrew Morton
    Acked-by: Michal Hocko
    Reviewed-by: Oscar Salvador
    Reviewed-by: Alexander Duyck
    Cc: K. Y. Srinivasan
    Cc: Haiyang Zhang
    Cc: Stephen Hemminger
    Cc: Boris Ostrovsky
    Cc: Juergen Gross
    Cc: Dan Williams
    Cc: Vlastimil Babka
    Cc: Joonsoo Kim
    Cc: Greg Kroah-Hartman
    Cc: Mathieu Malaterre
    Cc: "Kirill A. Shutemov"
    Cc: Souptick Joarder
    Cc: Mel Gorman
    Cc: Aaron Lu
    Cc: Srivatsa Vaddagiri
    Cc: Vinayak Menon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun KS
     

05 Mar, 2019

1 commit

  • The legacy hypercall handlers were originally added with
    a comment explaining that "copying the argument structures in
    HYPERVISOR_event_channel_op() and HYPERVISOR_physdev_op() into the local
    variable is sufficiently safe" and only made sure to not write
    past the end of the argument structure, the checks in linux/string.h
    disagree with that, when link-time optimizations are used:

    In function 'memcpy',
    inlined from 'pirq_query_unmask' at drivers/xen/fallback.c:53:2,
    inlined from '__startup_pirq' at drivers/xen/events/events_base.c:529:2,
    inlined from 'restore_pirqs' at drivers/xen/events/events_base.c:1439:3,
    inlined from 'xen_irq_resume' at drivers/xen/events/events_base.c:1581:2:
    include/linux/string.h:350:3: error: call to '__read_overflow2' declared with attribute error: detected read beyond size of object passed as 2nd parameter
    __read_overflow2();
    ^

    Further research turned out that only Xen 3.0.2 or earlier required the
    fallback at all, while all versions in use today don't need it.
    As far as I can tell, it is not even possible to run a mainline kernel
    on those old Xen releases, at the time when they were in use, only
    a patched kernel was supported anyway.

    Fixes: cf47a83fb06e ("xen/hypercall: fix hypercall fallback code for very old hypervisors")
    Reviewed-by: Boris Ostrovsky
    Cc: Jan Beulich
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Juergen Gross

    Arnd Bergmann
     

04 Mar, 2019

1 commit


18 Feb, 2019

6 commits

  • Don't allow memory to be added above the allowed maximum allocation
    limit set by Xen.

    Trying to do so would result in cases like the following:

    [ 584.559652] ------------[ cut here ]------------
    [ 584.564897] WARNING: CPU: 2 PID: 1 at ../arch/x86/xen/multicalls.c:129 xen_alloc_pte+0x1c7/0x390()
    [ 584.575151] Modules linked in:
    [ 584.578643] Supported: Yes
    [ 584.581750] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.4.120-92.70-default #1
    [ 584.590000] Hardware name: Cisco Systems Inc UCSC-C460-M4/UCSC-C460-M4, BIOS C460M4.4.0.1b.0.0629181419 06/29/2018
    [ 584.601862] 0000000000000000 ffffffff813175a0 0000000000000000 ffffffff8184777c
    [ 584.610200] ffffffff8107f4e1 ffff880487eb7000 ffff8801862b79c0 ffff88048608d290
    [ 584.618537] 0000000000487eb7 ffffea0000000201 ffffffff81009de7 ffffffff81068561
    [ 584.626876] Call Trace:
    [ 584.629699] [] dump_trace+0x59/0x340
    [ 584.635645] [] show_stack_log_lvl+0xea/0x170
    [ 584.642391] [] show_stack+0x21/0x40
    [ 584.648238] [] dump_stack+0x5c/0x7c
    [ 584.654085] [] warn_slowpath_common+0x81/0xb0
    [ 584.660932] [] xen_alloc_pte+0x1c7/0x390
    [ 584.667289] [] pmd_populate_kernel.constprop.6+0x40/0x80
    [ 584.675241] [] phys_pmd_init+0x210/0x255
    [ 584.681587] [] phys_pud_init+0x1da/0x247
    [ 584.687931] [] kernel_physical_mapping_init+0xf5/0x1d4
    [ 584.695682] [] init_memory_mapping+0x18d/0x380
    [ 584.702631] [] arch_add_memory+0x59/0xf0

    Signed-off-by: Juergen Gross
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Juergen Gross

    Juergen Gross
     
  • Check if there are any imported dma-bufs left not released by
    user-space when grant device's release callback is called and
    free those if this is the case. This can happen if user-space
    leaks the buffers because of a bug or application has been
    terminated for any reason.

    Signed-off-by: Oleksandr Andrushchenko
    Reviewed-by: Boris Ostrovsky@oracle.com>
    Signed-off-by: Juergen Gross

    Oleksandr Andrushchenko
     
  • If there are exported DMA buffers which are still in use and
    grant device is closed by either normal user-space close or by
    a signal this leads to the grant device context to be destroyed,
    thus making it not possible to correctly destroy those exported
    buffers when they are returned back to gntdev and makes the module
    crash:

    [ 339.617540] [] dmabuf_exp_ops_release+0x40/0xa8
    [ 339.617560] [] dma_buf_release+0x60/0x190
    [ 339.617577] [] __fput+0x88/0x1d0
    [ 339.617589] [] ____fput+0xc/0x18
    [ 339.617607] [] task_work_run+0x9c/0xc0
    [ 339.617622] [] do_notify_resume+0xfc/0x108

    Fix this by referencing gntdev on each DMA buffer export and
    unreferencing on buffer release.

    Signed-off-by: Oleksandr Andrushchenko
    Reviewed-by: Boris Ostrovsky@oracle.com>
    Signed-off-by: Juergen Gross

    Oleksandr Andrushchenko
     
  • There is no need for this at all. Worst it means that if
    the guest tries to write to BARs it could lead (on certain
    platforms) to PCI SERR errors.

    Please note that with af6fc858a35b90e89ea7a7ee58e66628c55c776b
    "xen-pciback: limit guest control of command register"
    a guest is still allowed to enable those control bits (safely), but
    is not allowed to disable them and that therefore a well behaved
    frontend which enables things before using them will still
    function correctly.

    This is done via an write to the configuration register 0x4 which
    triggers on the backend side:
    command_write
    \- pci_enable_device
    \- pci_enable_device_flags
    \- do_pci_enable_device
    \- pcibios_enable_device
    \-pci_enable_resourcess
    [which enables the PCI_COMMAND_MEMORY|PCI_COMMAND_IO]

    However guests (and drivers) which don't do this could cause
    problems, including the security issues which XSA-120 sought
    to address.

    Reported-by: Jan Beulich
    Signed-off-by: Konrad Rzeszutek Wilk
    Reviewed-by: Prarit Bhargava
    Signed-off-by: Juergen Gross

    Konrad Rzeszutek Wilk
     
  • In preparation to enabling -Wimplicit-fallthrough, mark switch
    cases where we are expecting to fall through.

    This patch fixes the following warning:

    drivers/xen/xen-scsiback.c: In function ‘scsiback_frontend_changed’:
    drivers/xen/xen-scsiback.c:1185:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
    if (xenbus_dev_is_online(dev))
    ^
    drivers/xen/xen-scsiback.c:1188:2: note: here
    case XenbusStateUnknown:
    ^~~~

    Warning level 3 was used: -Wimplicit-fallthrough=3

    Notice that, in this particular case, the code comment is modified
    in accordance with what GCC is expecting to find.

    This patch is part of the ongoing efforts to enable
    -Wimplicit-fallthrough.

    Signed-off-by: Gustavo A. R. Silva
    Signed-off-by: Juergen Gross

    Gustavo A. R. Silva
     
  • In preparation to enabling -Wimplicit-fallthrough, mark switch
    cases where we are expecting to fall through.

    This patch fixes the following warning:

    drivers/xen/xen-pciback/xenbus.c: In function ‘xen_pcibk_frontend_changed’:
    drivers/xen/xen-pciback/xenbus.c:545:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
    if (xenbus_dev_is_online(xdev))
    ^
    drivers/xen/xen-pciback/xenbus.c:548:2: note: here
    case XenbusStateUnknown:
    ^~~~

    Warning level 3 was used: -Wimplicit-fallthrough=3

    Notice that, in this particular case, the code comment is modified
    in accordance with what GCC is expecting to find.

    This patch is part of the ongoing efforts to enable
    -Wimplicit-fallthrough.

    Signed-off-by: Gustavo A. R. Silva
    Signed-off-by: Juergen Gross

    Gustavo A. R. Silva
     

05 Feb, 2019

1 commit

  • Due to the patch that makes TMF handling synchronous the
    write_pending_status() callback function is no longer called. Hence remove
    it.

    Acked-by: Felipe Balbi
    Reviewed-by: Sagi Grimberg
    Reviewed-by: Andy Grover
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Bryant G. Ly
    Cc: Nicholas Bellinger
    Cc: Mike Christie
    Cc: Himanshu Madhani
    Cc: Quinn Tran
    Cc: Saurav Kashyap
    Cc: Michael S. Tsirkin
    Cc: Juergen Gross
    Signed-off-by: Bart Van Assche
    Signed-off-by: Martin K. Petersen

    Bart Van Assche
     

24 Jan, 2019

1 commit

  • Xen-swiotlb hooks into the arm/arm64 arch code through a copy of the DMA
    DMA mapping operations stored in the struct device arch data.

    Switching arm64 to use the direct calls for the merged DMA direct /
    swiotlb code broke this scheme. Replace the indirect calls with
    direct-calls in xen-swiotlb as well to fix this problem.

    Fixes: 356da6d0cde3 ("dma-mapping: bypass indirect calls for dma-direct")
    Reported-by: Julien Grall
    Signed-off-by: Christoph Hellwig
    Reviewed-by: Stefano Stabellini

    Christoph Hellwig
     

19 Jan, 2019

1 commit

  • Pull xen fixes from Juergen Gross:

    - Several fixes for the Xen pvcalls drivers (1 fix for the backend and
    8 for the frontend).

    - A fix for a rather longstanding bug in the Xen sched_clock()
    interface which led to weird time jumps when migrating the system.

    - A fix for avoiding accesses to x2apic MSRs in Xen PV guests.

    * tag 'for-linus-5.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    xen: Fix x86 sched_clock() interface for xen
    pvcalls-front: fix potential null dereference
    always clear the X2APIC_ENABLE bit for PV guest
    pvcalls-front: Avoid get_free_pages(GFP_KERNEL) under spinlock
    xen/pvcalls: remove set but not used variable 'intf'
    pvcalls-back: set -ENOTCONN in pvcalls_conn_back_read
    pvcalls-front: don't return error when the ring is full
    pvcalls-front: properly allocate sk
    pvcalls-front: don't try to free unallocated rings
    pvcalls-front: read all data before closing the connection

    Linus Torvalds