19 Mar, 2019

1 commit

  • commit 7fbe078c37aba3088359c9256c1a1d0c3e39ee81 upstream.

    The vsock core only supports 32-bit CIDs, but the virtio-vsock spec
    defines the CID (dst_cid and src_cid) as u64, with the upper 32 bits
    reserved as zero. This inconsistency causes a bug in the vhost vsock
    driver. The scenario is:

    0. A hash table (vhost_vsock_hash) is used to map a CID to a vsock
    object, and hash_min() is used to compute the hash key. hash_min() is
    defined as:
    (sizeof(val)
    Reviewed-by: Liu Jiang
    Reviewed-by: Stefan Hajnoczi
    Acked-by: Jason Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Shengjing Zhu
    Signed-off-by: Greg Kroah-Hartman

    Zha Bin
     

27 Feb, 2019

1 commit

  • [ Upstream commit 74ad7419489ddade8044e3c9ab064ad656520306 ]

    We've failed to copy and process vhost_iotlb_msg, so let userspace at
    least know about it. For instance, before this patch the code below
    runs without any error:

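    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/uio.h>
    #include <linux/vhost.h>

    int main(void)
    {
        struct vhost_msg msg;
        struct iovec iov;
        int fd;

        fd = open("/dev/vhost-net", O_RDWR);
        if (fd == -1) {
            perror("open");
            return 1;
        }

        /* Deliberately pass a buffer 4 bytes shorter than struct vhost_msg. */
        iov.iov_base = &msg;
        iov.iov_len = sizeof(msg) - 4;

        if (writev(fd, &iov, 1) == -1) {
            perror("writev");
            return 1;
        }

        return 0;
    }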

    Signed-off-by: Pavel Tikhomirov
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Sasha Levin

    Pavel Tikhomirov
     

23 Feb, 2019

1 commit

  • [ Upstream commit 816db7663565cd23f74ed3d5c9240522e3fb0dda ]

    On failure, translate_desc() returns a negative value; otherwise it
    returns the number of iovs. So we should fail only when the return
    value is negative instead of blindly checking against zero.

    Detected by CoverityScan, CID# 1442593: Control flow issues (DEADCODE)

    Fixes: cc5e71075947 ("vhost: log dirty page correctly")
    Acked-by: Michael S. Tsirkin
    Reported-by: Stephen Hemminger
    Signed-off-by: Jason Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Jason Wang
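
The commit above depends on a return-value convention: negative means error, any non-negative value is a count. The following is a minimal sketch of that convention and of the two ways of checking it. translate_sketch() and both callers are hypothetical stand-ins written for illustration; they are not the kernel's translate_desc() or its callers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in: returns the number of chunk-sized pieces needed
 * to cover len bytes, or a negative errno value on failure. */
static int translate_sketch(size_t len, size_t chunk)
{
    if (len == 0 || chunk == 0)
        return -EINVAL;
    return (int)((len + chunk - 1) / chunk);
}

/* Buggy caller: treats every non-zero return as a failure, so a
 * successful translation (positive count) is rejected as well. */
static int log_write_buggy(size_t len, size_t chunk)
{
    int ret = translate_sketch(len, chunk);

    if (ret)
        return -1;
    return 0;
}

/* Fixed caller: only a negative return value is an error. */
static int log_write_fixed(size_t len, size_t chunk)
{
    int ret = translate_sketch(len, chunk);

    if (ret < 0)
        return ret;
    return 0;
}
```

With a 10-byte request split into 4-byte chunks, the buggy caller fails even though the translation succeeded, while the fixed caller only propagates genuine errors.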
     

07 Feb, 2019

1 commit

  • [ Upstream commit b46a0bf78ad7b150ef5910da83859f7f5a514ffd ]

    After batched used ring updating was introduced in commit e2b3b35eb989
    ("vhost_net: batch used ring update in rx"), we tend to batch heads in
    vq->heads for more than one packet. But the quota passed to
    get_rx_bufs() was not correctly limited, which can result in an OOB
    write in vq->heads.

    headcount = get_rx_bufs(vq, vq->heads + nvq->done_idx,
                            vhost_len, &in, vq_log, &log,
                            likely(mergeable) ? UIO_MAXIOV : 1);

    UIO_MAXIOV was still used, which is wrong since we could already have
    batched used entries in vq->heads. This will cause an OOB write if the
    next buffer needs more than 960 (1024 (UIO_MAXIOV) - 64
    (VHOST_NET_BATCH)) heads after we've batched 64 (VHOST_NET_BATCH) heads:
    Acked-by: Stefan Hajnoczi

    =============================================================================
    BUG kmalloc-8k (Tainted: G B ): Redzone overwritten
    -----------------------------------------------------------------------------

    INFO: 0x00000000fd93b7a2-0x00000000f0713384. First byte 0xa9 instead of 0xcc
    INFO: Allocated in alloc_pd+0x22/0x60 age=3933677 cpu=2 pid=2674
    kmem_cache_alloc_trace+0xbb/0x140
    alloc_pd+0x22/0x60
    gen8_ppgtt_create+0x11d/0x5f0
    i915_ppgtt_create+0x16/0x80
    i915_gem_create_context+0x248/0x390
    i915_gem_context_create_ioctl+0x4b/0xe0
    drm_ioctl_kernel+0xa5/0xf0
    drm_ioctl+0x2ed/0x3a0
    do_vfs_ioctl+0x9f/0x620
    ksys_ioctl+0x6b/0x80
    __x64_sys_ioctl+0x11/0x20
    do_syscall_64+0x43/0xf0
    entry_SYSCALL_64_after_hwframe+0x44/0xa9
    INFO: Slab 0x00000000d13e87af objects=3 used=3 fp=0x (null) flags=0x200000000010201
    INFO: Object 0x0000000003278802 @offset=17064 fp=0x00000000e2e6652b

    Fix this by allocating UIO_MAXIOV + VHOST_NET_BATCH iovs for
    vhost-net. This is done by setting the limit through
    vhost_dev_init(), so that set_owner can allocate the number of iovs
    on a per-device basis.

    This fixes CVE-2018-16880.

    Fixes: e2b3b35eb989 ("vhost_net: batch used ring update in rx")
    Signed-off-by: Jason Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Jason Wang
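
The bounds arithmetic in the commit above can be checked with a small sketch. The two constants use the values quoted in the commit message (1024 and 64); heads_fit() is a hypothetical helper for illustration, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>

#define UIO_MAXIOV_SKETCH      1024  /* value quoted in the commit message */
#define VHOST_NET_BATCH_SKETCH   64  /* value quoted in the commit message */

static_assert(VHOST_NET_BATCH_SKETCH < UIO_MAXIOV_SKETCH,
              "the batch size is assumed to be smaller than the iov limit");

/* Writing up to `quota` heads starting at offset done_idx touches
 * indices done_idx .. done_idx + quota - 1, so that whole range has to
 * fit inside an array of `capacity` entries. */
static bool heads_fit(unsigned capacity, unsigned done_idx, unsigned quota)
{
    return done_idx + quota <= capacity;
}
```

With 64 heads already batched, a quota of UIO_MAXIOV overruns an array of UIO_MAXIOV entries, while an array of UIO_MAXIOV + VHOST_NET_BATCH entries is large enough.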
     

31 Jan, 2019

1 commit

  • [ Upstream commit cc5e710759470bc7f3c61d11fd54586f15fdbdf4 ]

    The vhost dirty page logging API is designed to sync through GPA, but
    we try to log GIOVA when the device IOTLB is enabled. This is wrong and
    may lead to missing data after migration.

    To solve this issue, when logging with the device IOTLB enabled, we will:

    1) reuse the device IOTLB translation result of the GIOVA->HVA mapping
    to get the HVA: for a writable descriptor, get the HVA through the
    iovec; for a used ring update, translate its GIOVA to an HVA
    2) traverse the GPA->HVA mapping to get the possible GPAs and log
    through GPA. Note that this reverse mapping is not guaranteed
    to be unique, so we should log each possible GPA in this case.

    This fixes the failure of scp to the guest during migration. In -next,
    we will probably support passing GIOVA->GPA instead of GIOVA->HVA.

    Fixes: 6b1e6cc7855b ("vhost: new device IOTLB API")
    Reported-by: Jintack Lim
    Cc: Jintack Lim
    Signed-off-by: Jason Wang
    Acked-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Jason Wang
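
Step 2 of the commit above, the reverse lookup from an HVA to every GPA that maps to it, might look roughly like the sketch below. struct region_sketch and hva_to_gpas() are hypothetical names chosen for illustration; the kernel's memory table and logging code are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory region: a guest physical range mapped at a host
 * virtual address. */
struct region_sketch {
    uint64_t gpa;   /* guest physical start */
    uint64_t hva;   /* host virtual start */
    uint64_t size;  /* length in bytes */
};

/* Collect every GPA whose region contains hva. The mapping is not
 * guaranteed to be unique, so all matches are reported. Returns the
 * number of matches found; at most max of them are stored in out. */
static size_t hva_to_gpas(const struct region_sketch *r, size_t n,
                          uint64_t hva, uint64_t *out, size_t max)
{
    size_t found = 0;

    assert(out != NULL || max == 0);
    for (size_t i = 0; i < n; i++) {
        if (hva >= r[i].hva && hva - r[i].hva < r[i].size) {
            if (found < max)
                out[found] = r[i].gpa + (hva - r[i].hva);
            found++;
        }
    }
    return found;
}
```

A caller would then mark each returned GPA as dirty, which is why two regions sharing one HVA range produce two log entries.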
     

10 Jan, 2019

1 commit

  • [ Upstream commit 841df922417eb82c835e93d4b93eb6a68c99d599 ]

    We are missing a write barrier that guarantees the used idx is updated
    and visible before the log is written. Without it, userspace may sync
    and copy the used ring before the used idx is updated. Fix this by
    adding a barrier before log_write().

    Fixes: 8dd014adfea6f ("vhost-net: mergeable buffers support")
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Jason Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Jason Wang
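
The ordering requirement in the commit above can be illustrated with a userspace analogue built on C11 fences. This is only a sketch under stated assumptions: the kernel uses its own barrier primitives rather than <stdatomic.h>, and the two variables and both functions below are hypothetical stand-ins for the used index and the dirty log.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

static _Atomic uint16_t used_idx_sketch;  /* stands in for the used ring index */
static atomic_int log_dirty_sketch;       /* stands in for the dirty log entry */

/* Writer: update the index first, then issue a release fence, and only
 * then publish the log entry. */
static void publish_used_idx_sketch(uint16_t new_idx)
{
    atomic_store_explicit(&used_idx_sketch, new_idx, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&log_dirty_sketch, 1, memory_order_relaxed);
}

/* Reader: once the log entry is observed, the acquire fence guarantees
 * that the index written before the release fence is visible too.
 * Returns 1 and stores the index if the log entry was set, else 0. */
static int reader_sync_sketch(uint16_t *out)
{
    assert(out != NULL);
    if (!atomic_load_explicit(&log_dirty_sketch, memory_order_relaxed))
        return 0;
    atomic_thread_fence(memory_order_acquire);
    *out = atomic_load_explicit(&used_idx_sketch, memory_order_relaxed);
    return 1;
}
```

Without the release fence, the two relaxed stores could become visible in the opposite order, which corresponds to the stale used idx described in the commit message.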
     

21 Dec, 2018

1 commit

  • [ Upstream commit c38f57da428b033f2721b611d84b1f40bde674a8 ]

    If a local process has closed a connected socket and hasn't received a
    RST packet yet, then the socket remains in the table until a timeout
    expires.

    When a vhost_vsock instance is released with the timeout still pending,
    the socket is never freed because vhost_vsock has already set the
    SOCK_DONE flag.

    Check if the close timer is pending and let it close the socket. This
    prevents the race which can leak sockets.

    Reported-by: Maximilian Riemensberger
    Cc: Graham Whaley
    Signed-off-by: Stefan Hajnoczi
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Sasha Levin

    Stefan Hajnoczi
     

13 Dec, 2018

1 commit

  • commit 834e772c8db0c6a275d75315d90aba4ebbb1e249 upstream.

    If the network stack calls .send_pkt()/.cancel_pkt() during .release(),
    a struct vhost_vsock use-after-free is possible. This occurs because
    .release() does not wait for other CPUs to stop using struct
    vhost_vsock.

    Switch to an RCU-enabled hashtable (indexed by guest CID) so that
    .release() can wait for other CPUs by calling synchronize_rcu(). This
    also eliminates vhost_vsock_lock acquisition in the data path so it
    could have a positive effect on performance.

    This is CVE-2018-14625 "kernel: use-after-free Read in vhost_transport_send_pkt".

    Cc: stable@vger.kernel.org
    Reported-and-tested-by: syzbot+bd391451452fb0b93039@syzkaller.appspotmail.com
    Reported-by: syzbot+e3e074963495f92a89ed@syzkaller.appspotmail.com
    Reported-by: syzbot+d5a0a170c5069658b141@syzkaller.appspotmail.com
    Signed-off-by: Stefan Hajnoczi
    Signed-off-by: Michael S. Tsirkin
    Acked-by: Jason Wang
    Signed-off-by: Greg Kroah-Hartman

    Stefan Hajnoczi
     

21 Nov, 2018

1 commit

  • commit 4542d623c7134bc1738f8a68ccb6dd546f1c264f upstream.

    Commands with protection information included were not truncating the
    protection iov_iter to the number of protection bytes in the command.
    This resulted in vhost_scsi mis-calculating the size of the protection
    SGL in vhost_scsi_calc_sgls(), and including both the protection and
    data SG entries in the protection SGL.

    Fixes: 09b13fa8c1a1 ("vhost/scsi: Add ANY_LAYOUT support in vhost_scsi_handle_vq")
    Signed-off-by: Greg Edwards
    Signed-off-by: Michael S. Tsirkin
    Fixes: 09b13fa8c1a1093e9458549ac8bb203a7c65c62a
    Cc: stable@vger.kernel.org
    Reviewed-by: Paolo Bonzini
    Signed-off-by: Greg Kroah-Hartman

    Greg Edwards
     

04 Nov, 2018

1 commit

  • [ Upstream commit ff002269a4ee9c769dbf9365acef633ebcbd6cbe ]

    The idx in vhost_vring_ioctl() is controlled by userspace, which opens
    it to potential exploitation of the Spectre variant 1 vulnerability.

    Fix this by sanitizing idx before using it to index d->vqs.

    Cc: Michael S. Tsirkin
    Cc: Josh Poimboeuf
    Cc: Andrea Arcangeli
    Signed-off-by: Jason Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Jason Wang
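
Sanitizing an index against speculative out-of-bounds use is usually done with a branch-free mask. The sketch below is modelled on the kernel's generic array_index_mask_nospec() helper, but it is a userspace illustration with hypothetical names: it assumes size <= LONG_MAX and an arithmetic right shift of negative values (as GCC and Clang provide), and it lacks the optimizer barrier the kernel adds. Kernel code should use array_index_nospec() from <linux/nospec.h> instead.

```c
#include <assert.h>
#include <limits.h>

/* All ones when index < size, zero otherwise, computed without a branch.
 * If index >= size, either index or (size - 1 - index) has its top bit
 * set, so the inverted value shifts down to zero. */
static unsigned long index_mask_sketch(unsigned long index, unsigned long size)
{
    const unsigned bits = sizeof(long) * CHAR_BIT;

    return (unsigned long)(~(long)(index | (size - 1UL - index)) >> (bits - 1));
}

/* Returns index unchanged when it is in bounds, and 0 otherwise, so a
 * mispredicted bounds check can never feed an out-of-range index. */
static unsigned long clamp_index_sketch(unsigned long index, unsigned long size)
{
    assert(size > 0 && size <= (unsigned long)LONG_MAX);
    return index & index_mask_sketch(index, size);
}
```

The normal bounds check still has to run before the clamp; the mask only covers the speculative path.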
     

28 Aug, 2018

1 commit

  • Pull networking fixes from David Miller:

    1) ICE, E1000, IGB, IXGBE, and I40E bug fixes from the Intel folks.

    2) Better fix for AB-BA deadlock in packet scheduler code, from Cong
    Wang.

    3) bpf sockmap fixes (zero sized key handling, etc.) from Daniel
    Borkmann.

    4) Send zero IPID in TCP resets and SYN-RECV state ACKs, to prevent
    attackers using it as a side-channel. From Eric Dumazet.

    5) Memory leak in mediatek bluetooth driver, from Gustavo A. R. Silva.

    6) Hook up rt->dst.input of ipv6 anycast routes properly, from Hangbin
    Liu.

    7) hns and hns3 bug fixes from Huazhong Tan.

    8) Fix RIF leak in mlxsw driver, from Ido Schimmel.

    9) iova range check fix in vhost, from Jason Wang.

    10) Fix hang in do_tcp_sendpages() with tls, from John Fastabend.

    11) More r8152 chips need to disable RX aggregation, from Kai-Heng Feng.

    12) Memory exposure in TCA_U32_SEL handling, from Kees Cook.

    13) TCP BBR congestion control fixes from Kevin Yang.

    14) hv_netvsc, ignore non-PCI devices, from Stephen Hemminger.

    15) qed driver fixes from Tomer Tayar.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (77 commits)
    net: sched: Fix memory exposure from short TCA_U32_SEL
    qed: fix spelling mistake "comparsion" -> "comparison"
    vhost: correctly check the iova range when waking virtqueue
    qlge: Fix netdev features configuration.
    net: macb: do not disable MDIO bus at open/close time
    Revert "net: stmmac: fix build failure due to missing COMMON_CLK dependency"
    net: macb: Fix regression breaking non-MDIO fixed-link PHYs
    mlxsw: spectrum_switchdev: Do not leak RIFs when removing bridge
    i40e: fix condition of WARN_ONCE for stat strings
    i40e: Fix for Tx timeouts when interface is brought up if DCB is enabled
    ixgbe: fix driver behaviour after issuing VFLR
    ixgbe: Prevent unsupported configurations with XDP
    ixgbe: Replace GFP_ATOMIC with GFP_KERNEL
    igb: Replace mdelay() with msleep() in igb_integrated_phy_loopback()
    igb: Replace GFP_ATOMIC with GFP_KERNEL in igb_sw_init()
    igb: Use an advanced ctx descriptor for launchtime
    e1000: ensure to free old tx/rx rings in set_ringparam()
    e1000: check on netif_running() before calling e1000_up()
    ixgb: use dma_zalloc_coherent instead of allocator/memset
    ice: Trivial formatting fixes
    ...

    Linus Torvalds
     

26 Aug, 2018

1 commit

  • We don't wake up the virtqueue if the first byte of the pending iova
    range is the last byte of the range we just got updated. This causes
    the virtqueue to wait for an IOTLB update forever. Fix this by
    correcting the check so that the virtqueue is woken up in this case.

    Fixes: 6b1e6cc7855b ("vhost: new device IOTLB API")
    Reported-by: Peter Xu
    Signed-off-by: Jason Wang
    Reviewed-by: Peter Xu
    Tested-by: Peter Xu
    Acked-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller

    Jason Wang
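
The off-by-one described in the commit above comes down to whether the last byte of the updated range is treated as inclusive. A minimal sketch follows, with hypothetical function names and without handling of ranges that wrap around the 64-bit address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when the updated range [iova, iova + size - 1] contains pending,
 * the first byte the virtqueue is waiting for. The last byte of the
 * range is inclusive. */
static bool update_covers_pending(uint64_t iova, uint64_t size, uint64_t pending)
{
    assert(size > 0);
    return iova <= pending && iova + size - 1 >= pending;
}

/* Off-by-one variant: the strict comparison misses the case where
 * pending is exactly the last byte of the updated range. */
static bool update_covers_pending_buggy(uint64_t iova, uint64_t size, uint64_t pending)
{
    assert(size > 0);
    return iova <= pending && iova + size - 1 > pending;
}
```

For an update covering 0x1000..0x1fff and a pending address of 0x1fff, only the inclusive check reports a match, so only it would wake the waiting virtqueue.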
     

24 Aug, 2018

1 commit

  • Pull virtio updates from Michael Tsirkin:
    "virtio, vhost: fixes, tweaks

    No new features but a bunch of tweaks such as switching balloon from
    oom notifier to shrinker"

    * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
    vhost/scsi: increase VHOST_SCSI_PREALLOC_PROT_SGLS to 2048
    vhost: allow vhost-scsi driver to be built-in
    virtio: pci-legacy: Validate queue pfn
    virtio: mmio-v1: Validate queue PFN
    virtio_balloon: replace oom notifier with shrinker
    virtio-balloon: kzalloc the vb struct
    virtio-balloon: remove BUG() in init_vqs

    Linus Torvalds
     

22 Aug, 2018

2 commits

  • The current value of VHOST_SCSI_PREALLOC_PROT_SGLS is too small to
    accommodate larger I/Os, e.g. 16-32 MiB, when the VIRTIO_SCSI_F_T10_PI
    feature bit is negotiated and the backing store supports T10 PI.

    vhost-scsi rejects the command with errors like:

    [ 59.581317] vhost_scsi_calc_sgls: requested sgl_count: 1820 exceeds pre-allocated max_sgls: 512

    Signed-off-by: Greg Edwards
    Signed-off-by: Michael S. Tsirkin

    Greg Edwards
     
  • It's useful to allow vhost-scsi to be built-in when testing vhost in L1
    + L2 VMs and booting L1 VM with QEMU '-kernel' option.

    Signed-off-by: Greg Edwards
    Signed-off-by: Michael S. Tsirkin

    Greg Edwards
     

16 Aug, 2018

1 commit

  • Pull SCSI updates from James Bottomley:
    "This is mostly updates to the usual drivers: mpt3sas, lpfc, qla2xxx,
    hisi_sas, smartpqi, megaraid_sas, arcmsr.

    In addition, with the continuing absence of Nic we have target updates
    for tcmu and target core (all with reviews and acks).

    The biggest observable change is going to be that we're (again) trying
    to switch to multiqueue as the default (a user can still override the
    setting on the kernel command line).

    Other major core stuff is the removal of the remaining Microchannel
    drivers, an update of the internal timers and some reworks of
    completion and result handling"

    * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (203 commits)
    scsi: core: use blk_mq_run_hw_queues in scsi_kick_queue
    scsi: ufs: remove unnecessary query(DM) UPIU trace
    scsi: qla2xxx: Fix issue reported by static checker for qla2x00_els_dcmd2_sp_done()
    scsi: aacraid: Spelling fix in comment
    scsi: mpt3sas: Fix calltrace observed while running IO & reset
    scsi: aic94xx: fix an error code in aic94xx_init()
    scsi: st: remove redundant pointer STbuffer
    scsi: qla2xxx: Update driver version to 10.00.00.08-k
    scsi: qla2xxx: Migrate NVME N2N handling into state machine
    scsi: qla2xxx: Save frame payload size from ICB
    scsi: qla2xxx: Fix stalled relogin
    scsi: qla2xxx: Fix race between switch cmd completion and timeout
    scsi: qla2xxx: Fix Management Server NPort handle reservation logic
    scsi: qla2xxx: Flush mailbox commands on chip reset
    scsi: qla2xxx: Fix unintended Logout
    scsi: qla2xxx: Fix session state stuck in Get Port DB
    scsi: qla2xxx: Fix redundant fc_rport registration
    scsi: qla2xxx: Silent erroneous message
    scsi: qla2xxx: Prevent sysfs access when chip is down
    scsi: qla2xxx: Add longer window for chip reset
    ...

    Linus Torvalds
     

10 Aug, 2018

1 commit


09 Aug, 2018

1 commit

  • We need to reset the metadata cache during new IOTLB initialization;
    otherwise stale pointers to the previous IOTLB may still be accessed,
    which leads to a use-after-free.

    Reported-by: syzbot+c51e6736a1bf614b3272@syzkaller.appspotmail.com
    Fixes: f88949138058 ("vhost: introduce O(1) vq metadata cache")
    Signed-off-by: Jason Wang
    Signed-off-by: David S. Miller

    Jason Wang
     

07 Aug, 2018

1 commit

  • We used to have a message like:

    struct vhost_msg {
        int type;
        union {
            struct vhost_iotlb_msg iotlb;
            __u8 padding[64];
        };
    };

    Unfortunately, there will be a 32-bit hole on 64-bit machines because
    of the alignment. This leads to different formats for the 32-bit API
    and the 64-bit API. What's more, it will break 32-bit programs running
    on 64-bit machines.

    So fix this by introducing a new message type with an explicit
    32-bit reserved field after type, like:

    struct vhost_msg_v2 {
        __u32 type;
        __u32 reserved;
        union {
            struct vhost_iotlb_msg iotlb;
            __u8 padding[64];
        };
    };

    We will have a consistent ABI after switching to this. To enable
    this capability, introduce a new ioctl (VHOST_SET_BACKEND_FEATURES) for
    userspace to enable this feature (VHOST_BACKEND_F_IOTLB_MSG_V2).

    Fixes: 6b1e6cc7855b ("vhost: new device IOTLB API")
    Signed-off-by: Jason Wang
    Signed-off-by: David S. Miller

    Jason Wang
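
The layout difference described in the commit above can be observed with offsetof(). The structs below are stand-ins that mirror the two layouts using fixed-width userspace types; the names with a _sketch suffix and the named union member u are assumptions made for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the IOTLB payload: three 64-bit fields and two bytes. */
struct iotlb_msg_sketch {
    uint64_t iova;
    uint64_t size;
    uint64_t uaddr;
    uint8_t perm;
    uint8_t type;
};

/* Old layout: the compiler may insert padding after the 4-byte type. */
struct msg_v1_sketch {
    int type;
    union {
        struct iotlb_msg_sketch iotlb;
        uint8_t padding[64];
    } u;
};

/* New layout: the explicit reserved field fills the gap on every ABI. */
struct msg_v2_sketch {
    uint32_t type;
    uint32_t reserved;
    union {
        struct iotlb_msg_sketch iotlb;
        uint8_t padding[64];
    } u;
};

static_assert(offsetof(struct msg_v2_sketch, u) == 8,
              "the v2 payload offset does not depend on the ABI");

/* Payload offsets: ABI-dependent for v1, fixed for v2. */
static size_t v1_payload_offset(void) { return offsetof(struct msg_v1_sketch, u); }
static size_t v2_payload_offset(void) { return offsetof(struct msg_v2_sketch, u); }
```

On ABIs that align 64-bit integers to 8 bytes, the v1 payload starts at offset 8; on ABIs that align them to 4 bytes, such as 32-bit x86, it starts at offset 4. The v2 payload starts at offset 8 in both cases.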
     

03 Aug, 2018

2 commits

  • This converts drivers that were only calling transport_deregister_session
    to use target_remove_session. The calling of
    transport_deregister_session_configfs via target_remove_session for these
    types of drivers is ok, because they were not exporting info from fields
    like sess_acl_list, sess->se_tpg and sess->fabric_sess_ptr from configfs
    accessible functions, so they will see no difference.

    Signed-off-by: Mike Christie
    Reviewed-by: Bart Van Assche
    Reviewed-by: Christoph Hellwig
    Cc: Felipe Balbi
    Cc: Sebastian Andrzej Siewior
    Cc: Andrzej Pietrasiewicz
    Cc: Michael S. Tsirkin
    Cc: Juergen Gross
    Signed-off-by: Martin K. Petersen

    Mike Christie
     
  • Rename target_alloc_session to target_setup_session to avoid confusion with
    the other transport session allocation function that only allocates the
    session and because the target_alloc_session does so much more. It
    allocates the session, sets up the nacl and registers the session.

    The next patch will then add a remove function to match the setup in this
    one, so it should make sense for all drivers, except iscsi, to just call
    those 2 functions to setup and remove a session.

    iscsi will continue to be the odd driver.

    Signed-off-by: Mike Christie
    Reviewed-by: Bart Van Assche
    Reviewed-by: Christoph Hellwig
    Cc: Chris Boot
    Cc: Bryant G. Ly
    Cc: Michael Cyr
    Cc: Johannes Thumshirn
    Cc: Felipe Balbi
    Cc: Sebastian Andrzej Siewior
    Cc: Andrzej Pietrasiewicz
    Cc: Michael S. Tsirkin
    Cc: Juergen Gross
    Signed-off-by: Martin K. Petersen

    Mike Christie
     

23 Jul, 2018

9 commits


04 Jul, 2018

4 commits

  • We may run out of available rx ring descriptors under heavy load, but
    busypoll did not detect this, so it may have exited prematurely. Avoid
    this by checking whether the rx ring is full during busypoll.

    Signed-off-by: Toshiaki Makita
    Acked-by: Jason Wang
    Signed-off-by: David S. Miller

    Toshiaki Makita
     
  • We may run handle_rx() while rx work is queued. For example, a packet
    can push the rx work during the window before handle_rx() calls
    vhost_net_disable_vq().
    In that case busypoll immediately exits due to the vhost_has_work()
    condition and enables the vq again. This can lead to further
    unnecessary rx wake-ups, so poll rx work instead of enabling the vq.

    Signed-off-by: Toshiaki Makita
    Acked-by: Jason Wang
    Signed-off-by: David S. Miller

    Toshiaki Makita
     
  • Under heavy load vhost busypoll may run without suppressing
    notification. For example tx zerocopy callback can push tx work while
    handle_tx() is running, then busyloop exits due to vhost_has_work()
    condition and enables notification but immediately reenters handle_tx()
    because the pushed work was tx. In this case handle_tx() tries to
    disable notification again, but when using event_idx it by design
    cannot. Then busyloop will run without suppressing notification.
    Another example is the case where handle_tx() tries to enable
    notification but avail idx is advanced so disables it again. This case
    also leads to the same situation with event_idx.

    The problem is that once we enter this situation busyloop does not work
    under heavy load for a considerable amount of time, because notification
    is likely to happen during busyloop and handle_tx() immediately enables
    notification after notification happens. Specifically busyloop detects
    notification by vhost_has_work() and then handle_tx() calls
    vhost_enable_notify(). Because the detected work was the tx work, it
    enters handle_tx(), and enters busyloop without suppression again.
    This is likely to be repeated, so with event_idx we are almost not able
    to suppress notification in this case.

    To fix this, poll the work instead of enabling notification when
    busypoll is interrupted by something. IMHO vhost_has_work() is a kind
    of interruption rather than a signal to completely cancel the busypoll,
    so let's run busypoll after the necessary work is done.

    Signed-off-by: Toshiaki Makita
    Acked-by: Jason Wang
    Signed-off-by: David S. Miller

    Toshiaki Makita
     
  • So we can easily see which variable is for which, tx or rx.

    Signed-off-by: Toshiaki Makita
    Acked-by: Jason Wang
    Signed-off-by: David S. Miller

    Toshiaki Makita
     

03 Jul, 2018

1 commit

  • Since most target drivers do not use the second fabric_make_tpg() argument
    ("group") and since it is trivial to derive the group pointer from the wwn
    pointer, do not pass the group pointer to fabric_make_tpg().

    Signed-off-by: Bart Van Assche
    Reviewed-by: Mike Christie
    Cc: Felipe Balbi
    Cc: Hannes Reinecke
    Cc: Christoph Hellwig
    Signed-off-by: Martin K. Petersen

    Bart Van Assche
     

23 Jun, 2018

1 commit

  • Sock will be NULL if we pass -1 to vhost_net_set_backend(), but when
    we hit errors during ubuf allocation, the code does not check for
    NULL before calling sockfd_put(). This leads to a NULL pointer
    dereference. Fix this by checking the sock pointer first.

    Fixes: bab632d69ee4 ("vhost: vhost TX zero-copy support")
    Reported-by: Dan Carpenter
    Signed-off-by: Jason Wang
    Signed-off-by: David S. Miller

    Jason Wang
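
The shape of the fix above is a guarded release on the error path. The sketch below uses hypothetical stand-ins (struct sock_sketch, sock_put_sketch(), set_backend_error_path()) and does not reproduce the kernel's socket API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference-counted socket. */
struct sock_sketch {
    int refs;
};

/* Drops one reference. Like the real release helper in the commit, it
 * dereferences its argument, so it must not be called with NULL. */
static void sock_put_sketch(struct sock_sketch *s)
{
    s->refs--;
}

/* Error path: release the socket reference only if one was taken. A
 * NULL sock models the "backend fd is -1" case from the commit. */
static int set_backend_error_path(struct sock_sketch *sock)
{
    if (sock)
        sock_put_sketch(sock);
    return -1;
}
```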
     

20 Jun, 2018

2 commits

  • The sbitmap and the percpu_ida perform essentially the same task,
    allocating tags for commands. The sbitmap outperforms the percpu_ida as
    documented here: https://lkml.org/lkml/2014/4/22/553

    The sbitmap interface is a little harder to use, but being able to remove
    the percpu_ida code and getting better performance justifies the additional
    complexity.

    Signed-off-by: Matthew Wilcox
    Acked-by: Felipe Balbi # f_tcm
    Reviewed-by: Jens Axboe
    Signed-off-by: Martin K. Petersen

    Matthew Wilcox
     
  • Introduce target_free_tag() and convert all drivers to use it.

    Signed-off-by: Matthew Wilcox
    Reviewed-by: Jens Axboe
    Signed-off-by: Martin K. Petersen

    Matthew Wilcox
     

16 Jun, 2018

1 commit

  • Pull virtio updates from Michael Tsirkin:
    "virtio, vhost: features, fixes

    - PCI virtual function support for virtio

    - DMA barriers for virtio strong barriers

    - bugfixes"

    * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
    virtio: update the comments for transport features
    virtio_pci: support enabling VFs
    vhost: fix info leak due to uninitialized memory
    virtio_ring: switch to dma_XX barriers for rpmsg

    Linus Torvalds
     

13 Jun, 2018

1 commit

  • The kzalloc() function has a 2-factor argument form, kcalloc(). This
    patch replaces cases of:

    kzalloc(a * b, gfp)

    with:

    kcalloc(a, b, gfp)

    as well as handling cases of:

    kzalloc(a * b * c, gfp)

    with:

    kzalloc(array3_size(a, b, c), gfp)

    as it's slightly less ugly than:

    kcalloc(array_size(a, b), c, gfp)

    This does, however, attempt to ignore constant size factors like:

    kzalloc(4 * 1024, gfp)

    though any constants defined via macros get caught up in the conversion.

    Any factors with a sizeof() of "unsigned char", "char", and "u8" were
    dropped, since they're redundant.

    The Coccinelle script used for this was:

    // Fix redundant parens around sizeof().
    @@
    type TYPE;
    expression THING, E;
    @@

    (
    kzalloc(
    - (sizeof(TYPE)) * E
    + sizeof(TYPE) * E
    , ...)
    |
    kzalloc(
    - (sizeof(THING)) * E
    + sizeof(THING) * E
    , ...)
    )

    // Drop single-byte sizes and redundant parens.
    @@
    expression COUNT;
    typedef u8;
    typedef __u8;
    @@

    (
    kzalloc(
    - sizeof(u8) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(__u8) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(char) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(unsigned char) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(u8) * COUNT
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(__u8) * COUNT
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(char) * COUNT
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(unsigned char) * COUNT
    + COUNT
    , ...)
    )

    // 2-factor product with sizeof(type/expression) and identifier or constant.
    @@
    type TYPE;
    expression THING;
    identifier COUNT_ID;
    constant COUNT_CONST;
    @@

    (
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * (COUNT_ID)
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * COUNT_ID
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * (COUNT_CONST)
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * COUNT_CONST
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * (COUNT_ID)
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * COUNT_ID
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * (COUNT_CONST)
    + COUNT_CONST, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * COUNT_CONST
    + COUNT_CONST, sizeof(THING)
    , ...)
    )

    // 2-factor product, only identifiers.
    @@
    identifier SIZE, COUNT;
    @@

    - kzalloc
    + kcalloc
    (
    - SIZE * COUNT
    + COUNT, SIZE
    , ...)

    // 3-factor product with 1 sizeof(type) or sizeof(expression), with
    // redundant parens removed.
    @@
    expression THING;
    identifier STRIDE, COUNT;
    type TYPE;
    @@

    (
    kzalloc(
    - sizeof(TYPE) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    )

    // 3-factor product with 2 sizeof(variable), with redundant parens removed.
    @@
    expression THING1, THING2;
    identifier COUNT;
    type TYPE1, TYPE2;
    @@

    (
    kzalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kzalloc(
    - sizeof(THING1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kzalloc(
    - sizeof(THING1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    )

    // 3-factor product, only identifiers, with redundant parens removed.
    @@
    identifier STRIDE, SIZE, COUNT;
    @@

    (
    kzalloc(
    - (COUNT) * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - (COUNT) * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - (COUNT) * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - (COUNT) * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    )

    // Any remaining multi-factor products, first at least 3-factor products,
    // when they're not all constants...
    @@
    expression E1, E2, E3;
    constant C1, C2, C3;
    @@

    (
    kzalloc(C1 * C2 * C3, ...)
    |
    kzalloc(
    - (E1) * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc(
    - (E1) * (E2) * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc(
    - (E1) * (E2) * (E3)
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc(
    - E1 * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    )

    // And then all remaining 2-factor products when they're not all constants,
    // keeping sizeof() as the second factor argument.
    @@
    expression THING, E1, E2;
    type TYPE;
    constant C1, C2, C3;
    @@

    (
    kzalloc(sizeof(THING) * C2, ...)
    |
    kzalloc(sizeof(TYPE) * C2, ...)
    |
    kzalloc(C1 * C2 * C3, ...)
    |
    kzalloc(C1 * C2, ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * (E2)
    + E2, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * E2
    + E2, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * (E2)
    + E2, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * E2
    + E2, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - (E1) * E2
    + E1, E2
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - (E1) * (E2)
    + E1, E2
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - E1 * E2
    + E1, E2
    , ...)
    )

    Signed-off-by: Kees Cook

    Kees Cook
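
The point of the kzalloc() to kcalloc() conversion above is that the multiplication of the size factors is checked for overflow instead of silently wrapping. The following userspace sketch shows the idea under stated assumptions: mul_overflows(), array3_size_sketch() and zalloc_array_sketch() are hypothetical helpers built on calloc(), not the kernel's implementations.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stores a * b in *out and returns 0, or returns 1 if the product would
 * not fit in size_t (in which case *out is left untouched). */
static int mul_overflows(size_t a, size_t b, size_t *out)
{
    if (a != 0 && b > SIZE_MAX / a)
        return 1;
    *out = a * b;
    return 0;
}

/* Three-factor size that saturates to SIZE_MAX on overflow, so that a
 * following allocation fails instead of being too small. */
static size_t array3_size_sketch(size_t a, size_t b, size_t c)
{
    size_t bytes;

    if (mul_overflows(a, b, &bytes) || mul_overflows(bytes, c, &bytes))
        return SIZE_MAX;
    return bytes;
}

/* Zeroed allocation of n elements of elem bytes each; returns NULL
 * rather than allocating a wrapped-around size. */
static void *zalloc_array_sketch(size_t n, size_t elem)
{
    size_t bytes;

    if (mul_overflows(n, elem, &bytes))
        return NULL;
    return calloc(1, bytes);
}
```

An open-coded a * b passed to an allocator can wrap to a small value and yield an undersized buffer; with the checked forms the request fails instead.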