13 May, 2019

2 commits


09 Apr, 2019

1 commit

  • vring_create_virtqueue() allows the caller to specify via the
    may_reduce_num parameter whether the vring code is allowed to
    allocate a smaller ring than specified.

    However, the split ring allocation code tries to allocate a
    smaller ring on allocation failure regardless of what the
    caller specified. This may cause trouble for e.g. virtio-pci
    in legacy mode, which does not support ring resizing. (The
    packed ring code does not resize in any case.)

    Let's fix this by bailing out immediately in the split ring code
    if the requested size cannot be allocated and may_reduce_num has
    not been specified.

    While at it, fix a typo in the usage instructions.

    Fixes: 2a2d1382fe9d ("virtio: Add improved queue allocation API")
    Cc: stable@vger.kernel.org # v4.6+
    Signed-off-by: Cornelia Huck
    Signed-off-by: Michael S. Tsirkin
    Reviewed-by: Halil Pasic
    Reviewed-by: Jens Freimann

    Cornelia Huck
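
    The fallback policy described in this commit can be sketched in
    userspace C. This is only an illustration under stated assumptions:
    try_alloc_ring(), alloc_ring() and the max_ok limit are hypothetical
    names, not the kernel's vring code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the ring allocator: fails above max_ok entries. */
static void *try_alloc_ring(unsigned int num, unsigned int max_ok)
{
    return num <= max_ok ? malloc((size_t)num * 16) : NULL;
}

/*
 * Allocate a ring of *num entries. Only when may_reduce_num is set do we
 * halve the size and retry; otherwise the first failure is final.
 * On success *num holds the size that was actually allocated.
 */
static void *alloc_ring(unsigned int *num, bool may_reduce_num,
                        unsigned int max_ok)
{
    unsigned int n = *num;

    while (n > 0) {
        void *ring = try_alloc_ring(n, max_ok);

        if (ring) {
            *num = n;
            return ring;
        }
        if (!may_reduce_num)
            return NULL;    /* caller cannot cope with a smaller ring */
        n /= 2;
    }
    return NULL;
}
```

    With may_reduce_num false, a failed 256-entry request returns NULL
    at once; with it true, the sketch retries with 128, then 64 entries.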
     

08 Apr, 2019

1 commit

  • If allocating msix_affinity_masks fails, we then try to free
    resources in vp_free_vectors(), which may access the
    unallocated array directly.

    We hit the following stack trace in production:
    [ 29.296767] BUG: unable to handle kernel NULL pointer dereference at (null)
    [ 29.311151] IP: [] vp_free_vectors+0x6a/0x150 [virtio_pci]
    [ 29.324787] PGD 0
    [ 29.333224] Oops: 0000 [#1] SMP
    [...]
    [ 29.425175] RIP: 0010:[] [] vp_free_vectors+0x6a/0x150 [virtio_pci]
    [ 29.441405] RSP: 0018:ffff9a55c2dcfa10 EFLAGS: 00010206
    [ 29.453491] RAX: 0000000000000000 RBX: ffff9a55c322c400 RCX: 0000000000000000
    [ 29.467488] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9a55c322c400
    [ 29.481461] RBP: ffff9a55c2dcfa20 R08: 0000000000000000 R09: ffffc1b6806ff020
    [ 29.495427] R10: 0000000000000e95 R11: 0000000000aaaaaa R12: 0000000000000000
    [ 29.509414] R13: 0000000000010000 R14: ffff9a55bd2d9e98 R15: ffff9a55c322c400
    [ 29.523407] FS: 00007fdcba69f8c0(0000) GS:ffff9a55c2840000(0000) knlGS:0000000000000000
    [ 29.538472] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 29.551621] CR2: 0000000000000000 CR3: 000000003ce52000 CR4: 00000000003607a0
    [ 29.565886] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 29.580055] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [ 29.594122] Call Trace:
    [ 29.603446] [] vp_request_msix_vectors+0xe2/0x260 [virtio_pci]
    [ 29.618017] [] vp_try_to_find_vqs+0x95/0x3b0 [virtio_pci]
    [ 29.632152] [] vp_find_vqs+0x37/0xb0 [virtio_pci]
    [ 29.645582] [] init_vq+0x153/0x260 [virtio_blk]
    [ 29.658831] [] virtblk_probe+0xe8/0x87f [virtio_blk]
    [...]

    Cc: Gonglei
    Signed-off-by: Longpeng
    Signed-off-by: Michael S. Tsirkin
    Reviewed-by: Gonglei

    Longpeng
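
    A minimal userspace sketch of the guard this fix implies: the
    cleanup path must tolerate an array that was never allocated. The
    struct and function names below are hypothetical and much simpler
    than the real virtio_pci code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified device state; not the real virtio_pci struct. */
struct demo_dev {
    int **affinity_masks;    /* NULL if its allocation failed */
    unsigned int nvectors;
};

/* Release all per-vector masks; returns the number of slots released. */
static unsigned int demo_free_vectors(struct demo_dev *dev)
{
    unsigned int i, released = 0;

    /* Only walk the array if it was actually allocated. */
    if (dev->affinity_masks) {
        for (i = 0; i < dev->nvectors; i++) {
            free(dev->affinity_masks[i]);   /* free(NULL) is a no-op */
            released++;
        }
    }

    free(dev->affinity_masks);
    dev->affinity_masks = NULL;
    dev->nvectors = 0;
    return released;
}
```

    Calling demo_free_vectors() on a device whose array is NULL simply
    releases nothing instead of dereferencing the NULL pointer.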
     

07 Mar, 2019

4 commits

  • A virtio transport is free to implement some of the callbacks in
    virtio_config_ops in such a manner that they cannot be called from
    atomic context (e.g. virtio-ccw, which maps a lot of the callbacks
    to channel I/O, which is an inherently asynchronous mechanism).
    This can be very surprising for developers using the much more
    common virtio-pci transport, who then find out that things break
    when used on s390.

    The documentation for virtio_config_ops now contains a comment
    explaining this, but it makes sense to add a might_sleep() annotation
    to various wrapper functions in the virtio core to avoid surprises
    later.

    Note that annotations are NOT added to two classes of calls:
    - direct calls from device drivers (all current callers should be
    fine, however)
    - calls which clearly won't be made from atomic context (such as
    those ultimately coming in via the driver core)

    Signed-off-by: Cornelia Huck
    Signed-off-by: Michael S. Tsirkin

    Cornelia Huck
     
  • We've changed to kzalloc the vb struct, so no need to 0-initialize
    this field one more time.

    Signed-off-by: Wei Wang
    Signed-off-by: Michael S. Tsirkin
    Reviewed-by: Cornelia Huck

    Wei Wang
     
  • There is no need to update the balloon actual register when there is no
    ballooning request. This patch avoids update_balloon_size when diff is 0.

    Signed-off-by: Wei Wang
    Reviewed-by: Cornelia Huck
    Reviewed-by: Halil Pasic
    Signed-off-by: Michael S. Tsirkin

    Wei Wang
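
    The early return described in this commit might look roughly like
    the following sketch. The struct, its fields and both functions are
    hypothetical stand-ins; the write counter only exists to make the
    skipped update observable.

```c
#include <assert.h>

/* Hypothetical counters standing in for device and driver state. */
struct demo_balloon {
    long long target_pages;
    long long actual_pages;
    unsigned int actual_writes;    /* writes to the "actual" register */
};

static void demo_update_balloon_size(struct demo_balloon *vb)
{
    vb->actual_writes++;
}

static void demo_balloon_work(struct demo_balloon *vb)
{
    long long diff = vb->target_pages - vb->actual_pages;

    if (!diff)
        return;                  /* no ballooning request: nothing to report */

    vb->actual_pages += diff;    /* pretend the fill or leak fully succeeded */
    demo_update_balloon_size(vb);
}
```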
     
  • This function returns the maximum segment size for a single
    DMA transaction of a virtio device. The possible limit comes
    from the SWIOTLB implementation in the Linux kernel, which
    has an upper limit of (currently) 256 KiB of contiguous
    memory it can map. Other DMA-API implementations might also
    have limits.

    Use the new dma_max_mapping_size() function to determine the
    maximum mapping size when DMA-API is in use for virtio.

    Cc: stable@vger.kernel.org
    Reviewed-by: Konrad Rzeszutek Wilk
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Joerg Roedel
    Signed-off-by: Michael S. Tsirkin

    Joerg Roedel
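
    The selection logic described above can be sketched as follows.
    demo_dma_max_mapping_size() is a hypothetical stub that returns a
    fixed 256 KiB limit; in the kernel the value would come from the
    DMA layer for the given device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the DMA layer's per-device mapping limit. */
static size_t demo_dma_max_mapping_size(void)
{
    return 256 * 1024;    /* e.g. a bounce-buffer limit */
}

/*
 * Without the DMA API there is no mapping limit to respect, so report
 * "unlimited" (SIZE_MAX). With it, defer to the DMA layer.
 */
static size_t demo_virtio_max_dma_size(bool uses_dma_api)
{
    size_t max_segment_size = SIZE_MAX;

    if (uses_dma_api)
        max_segment_size = demo_dma_max_mapping_size();

    return max_segment_size;
}
```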
     

06 Feb, 2019

1 commit


24 Jan, 2019

1 commit

  • This patch introduces the support for VIRTIO_F_ORDER_PLATFORM.
    If this feature is negotiated, the driver must use the barriers
    suitable for hardware devices. Otherwise, the device and driver
    are assumed to be implemented in software, that is they can be
    assumed to run on identical CPUs in an SMP configuration. Thus
    a weaker form of memory barriers is sufficient to yield better
    performance.

    It is recommended that an add-in card based PCI device offers
    this feature for portability. The device will fail to operate
    further or will operate in a slower emulation mode if this
    feature is offered but not accepted.

    Signed-off-by: Tiwei Bie
    Signed-off-by: Michael S. Tsirkin

    Tiwei Bie
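
    A small sketch of how a driver could pick its barrier flavour from
    the negotiated feature bits. The DEMO_ names and the enum are
    illustrative assumptions; only the feature bit number (36) is taken
    from the virtio specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_VIRTIO_F_ORDER_PLATFORM 36    /* feature bit number in the spec */

enum demo_barrier {
    DEMO_BARRIER_SMP,         /* weaker barriers, software device assumed */
    DEMO_BARRIER_PLATFORM     /* barriers suitable for hardware devices */
};

static bool demo_has_feature(uint64_t features, unsigned int bit)
{
    return (features >> bit) & 1;
}

static enum demo_barrier demo_pick_barrier(uint64_t negotiated)
{
    return demo_has_feature(negotiated, DEMO_VIRTIO_F_ORDER_PLATFORM)
        ? DEMO_BARRIER_PLATFORM
        : DEMO_BARRIER_SMP;
}
```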
     

15 Jan, 2019

3 commits

  • virtio-ccw has deadlock issues with reading the config space inside the
    interrupt context, so we tweak the virtballoon_changed implementation
    by moving the config read operations into the related workqueue contexts.
    The config_read_bitmap is used as a flag to the workqueue callbacks
    about the related config fields that need to be read.

    The cmd_id_received is also renamed to cmd_id_received_cache, and
    the value should be obtained via virtio_balloon_cmd_id_received.

    Reported-by: Christian Borntraeger
    Signed-off-by: Wei Wang
    Reviewed-by: Cornelia Huck
    Reviewed-by: Halil Pasic
    Signed-off-by: Michael S. Tsirkin
    Cc: stable@vger.kernel.org
    Fixes: 86a559787e6f ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
    Tested-by: Christian Borntraeger

    Wei Wang
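
    The deferral pattern in this commit can be sketched with C11
    atomics: the notification path only sets a flag, and the later
    worker path performs the config read. All names, the flag bit and
    the fake config value are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_CONFIG_READ_CMD_ID (1u << 0)    /* hypothetical flag bit */

static atomic_uint config_read_bitmap;
static uint32_t device_cmd_id = 7;           /* pretend device config field */
static uint32_t cmd_id_received_cache;
static unsigned int config_reads;            /* counts config space reads */

static uint32_t demo_read_config(void)
{
    config_reads++;
    return device_cmd_id;
}

/* "Interrupt" side: no config access, only mark the field as stale. */
static void demo_config_changed(void)
{
    atomic_fetch_or(&config_read_bitmap, DEMO_CONFIG_READ_CMD_ID);
}

/* Worker side: refresh the cache only if the flag was set, then clear it. */
static uint32_t demo_cmd_id_received(void)
{
    unsigned int old = atomic_fetch_and(&config_read_bitmap,
                                        ~DEMO_CONFIG_READ_CMD_ID);

    if (old & DEMO_CONFIG_READ_CMD_ID)
        cmd_id_received_cache = demo_read_config();

    return cmd_id_received_cache;
}
```

    Repeated calls to demo_cmd_id_received() without a new change
    notification return the cached value and do not touch the config
    space again.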
     
  • Some vqs may not need to be allocated when their related feature bits
    are disabled. So callers may pass in such vqs with "names = NULL".
    Then we skip such vq allocations.

    Signed-off-by: Wei Wang
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Wei Wang
    Signed-off-by: Wei Wang
    Reviewed-by: Cornelia Huck
    Cc: stable@vger.kernel.org
    Fixes: 86a559787e6f ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")

    Wei Wang
     
  • In find_vqs, there will be no vq[i] allocation if its corresponding
    names[i] is NULL. For example, the caller may pass in four names
    with names[2] being NULL because the related feature bit is turned off,
    so technically there are 3 queues on the device, and the fourth name
    (names[3]) should correspond to the 3rd queue on the device.

    So we use queue_idx as the queue index, which is increased only when the
    queue exists.

    Signed-off-by: Wei Wang
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Wei Wang
    Signed-off-by: Wei Wang

    Wei Wang
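
    The indexing rule from the two commits above can be sketched like
    this. demo_find_vqs() and the dev_index[] output array are
    hypothetical; the point is only that queue_idx advances for
    non-NULL names alone.

```c
#include <assert.h>
#include <stddef.h>

/*
 * For each names[i], store the device queue index in dev_index[i], or -1
 * if the queue is skipped because names[i] is NULL.
 * Returns the number of queues that exist on the device.
 */
static unsigned int demo_find_vqs(size_t nvqs, const char *const names[],
                                  int dev_index[])
{
    unsigned int queue_idx = 0;
    size_t i;

    for (i = 0; i < nvqs; i++) {
        if (!names[i]) {
            dev_index[i] = -1;    /* feature off: no queue, no index used */
            continue;
        }
        dev_index[i] = (int)queue_idx++;
    }
    return queue_idx;
}
```

    With names = { "inflate", "deflate", NULL, "free_page" }, the last
    entry maps to device queue 2, the third queue on the device.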
     

03 Jan, 2019

1 commit

  • Pull virtio/vhost updates from Michael Tsirkin:
    "Features, fixes, cleanups:

    - discard in virtio blk

    - misc fixes and cleanups"

    * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
    vhost: correct the related warning message
    vhost: split structs into a separate header file
    virtio: remove deprecated VIRTIO_PCI_CONFIG()
    vhost/vsock: switch to a mutex for vhost_vsock_hash
    virtio_blk: add discard and write zeroes support

    Linus Torvalds
     

20 Dec, 2018

1 commit


27 Nov, 2018

11 commits


25 Oct, 2018

2 commits

  • The VIRTIO_BALLOON_F_PAGE_POISON feature bit is used to indicate if the
    guest is using page poisoning. Guest writes to the poison_val config
    field to tell host about the page poisoning value that is in use.

    Suggested-by: Michael S. Tsirkin
    Signed-off-by: Wei Wang
    Cc: Michael S. Tsirkin
    Cc: Michal Hocko
    Cc: Andrew Morton
    Signed-off-by: Michael S. Tsirkin

    Wei Wang
     
  • Negotiation of the VIRTIO_BALLOON_F_FREE_PAGE_HINT feature indicates the
    support of reporting hints of guest free pages to host via virtio-balloon.
    Currently, only free page blocks of order MAX_ORDER - 1 are reported. They are
    obtained one by one from the mm free list via the regular allocation
    function.

    Host requests the guest to report free page hints by sending a new cmd id
    to the guest via the free_page_report_cmd_id configuration register. When
    the guest starts to report, it first sends a start cmd to host via the
    free page vq, which acks to host the cmd id received. When the guest
    finishes reporting free pages, a stop cmd is sent to host via the vq.
    Host may also send a stop cmd id to the guest to stop the reporting.

    VIRTIO_BALLOON_CMD_ID_STOP: Host sends this cmd to stop the guest
    reporting.
    VIRTIO_BALLOON_CMD_ID_DONE: Host sends this cmd to tell the guest that
    the reported pages are ready to be freed.

    Why does the guest free the reported pages only when the host says
    they are ready to be freed?
    This is because freeing pages appears to be expensive for live migration.
    free_pages() dirties memory very quickly and makes the live migration fail
    to converge in some cases. So it is good to delay the free_page operation
    until the migration is done, when the host sends a command to the guest about that.

    Why do we need the new VIRTIO_BALLOON_CMD_ID_DONE, instead of reusing
    VIRTIO_BALLOON_CMD_ID_STOP?
    This is because live migration is usually done in several rounds. At the
    end of each round, host needs to send a VIRTIO_BALLOON_CMD_ID_STOP cmd to
    the guest to stop (or say pause) the reporting. The guest resumes the
    reporting when it receives a new command id at the beginning of the next
    round. So we need a new cmd id to distinguish between "stop reporting" and
    "ready to free the reported pages".

    TODO:
    - Add a batch page allocation API to amortize the allocation overhead.

    Signed-off-by: Wei Wang
    Signed-off-by: Liang Li
    Cc: Michael S. Tsirkin
    Cc: Michal Hocko
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Signed-off-by: Michael S. Tsirkin

    Wei Wang
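
    The distinction between the two reserved command ids can be
    sketched as a small dispatcher on the guest side. The DEMO_ names,
    the numeric values chosen for them and the action enum are
    assumptions for illustration, not the driver's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for the two reserved ids; any other id starts a round. */
#define DEMO_CMD_ID_STOP 0u
#define DEMO_CMD_ID_DONE 1u

enum demo_action {
    DEMO_PAUSE_REPORTING,     /* end of a migration round */
    DEMO_FREE_REPORTED,       /* migration finished: give pages back to mm */
    DEMO_START_REPORTING      /* new round: ack the id, then report hints */
};

static enum demo_action demo_handle_cmd_id(uint32_t cmd_id)
{
    switch (cmd_id) {
    case DEMO_CMD_ID_STOP:
        return DEMO_PAUSE_REPORTING;
    case DEMO_CMD_ID_DONE:
        return DEMO_FREE_REPORTED;
    default:
        return DEMO_START_REPORTING;
    }
}
```

    Keeping STOP and DONE separate lets the guest hold on to the
    reported pages across several rounds and free them only once.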
     

24 Aug, 2018

1 commit

  • Pull virtio updates from Michael Tsirkin:
    "virtio, vhost: fixes, tweaks

    No new features but a bunch of tweaks such as switching balloon from
    oom notifier to shrinker"

    * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
    vhost/scsi: increase VHOST_SCSI_PREALLOC_PROT_SGLS to 2048
    vhost: allow vhost-scsi driver to be built-in
    virtio: pci-legacy: Validate queue pfn
    virtio: mmio-v1: Validate queue PFN
    virtio_balloon: replace oom notifier with shrinker
    virtio-balloon: kzalloc the vb struct
    virtio-balloon: remove BUG() in init_vqs

    Linus Torvalds
     

22 Aug, 2018

5 commits

  • Legacy virtio over PCI uses a 32-bit PFN for the queue. If the
    queue PFN is too large to fit in 32 bits, which we could hit on
    arm64 systems with 52-bit physical addresses (even with 64K page
    size), the PFN is truncated and the device is left without a
    proper link to the other side of the queue.

    Add a check to validate the PFN, rather than silently breaking
    the devices.

    Cc: "Michael S. Tsirkin"
    Cc: Jason Wang
    Cc: Marc Zyngier
    Cc: Christoffer Dall
    Cc: Peter Maydel
    Cc: Jean-Philippe Brucker
    Signed-off-by: Suzuki K Poulose
    Signed-off-by: Michael S. Tsirkin

    Suzuki K Poulose
     
  • virtio-mmio with virtio-v1 uses a 32-bit PFN for the queue.
    If the queue PFN is too large to fit in 32 bits, which
    we could hit on arm64 systems with 52-bit physical addresses
    (even with 64K page size), the PFN is truncated and the device
    is left without a proper link to the other side of the queue.

    Add a check to validate the PFN, rather than silently breaking
    the devices.

    Cc: "Michael S. Tsirkin"
    Cc: Jason Wang
    Cc: Marc Zyngier
    Cc: Christoffer Dall
    Cc: Peter Maydel
    Cc: Jean-Philippe Brucker
    Signed-off-by: Suzuki K Poulose
    Signed-off-by: Michael S. Tsirkin

    Suzuki K Poulose
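
    The check added by the two PFN commits above can be sketched as
    follows. The page shift of 16 (64K pages) and the function name are
    assumptions chosen to match the example in the commit messages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 16    /* assumed 64K pages, as in the example */

/*
 * Convert a queue physical address to a PFN and refuse it if the PFN
 * does not fit the 32-bit register, instead of silently truncating.
 */
static bool demo_queue_pfn_fits(uint64_t queue_phys_addr, uint32_t *pfn_out)
{
    uint64_t q_pfn = queue_phys_addr >> DEMO_PAGE_SHIFT;

    if (q_pfn >> 32)
        return false;    /* would be truncated: fail the setup */

    *pfn_out = (uint32_t)q_pfn;
    return true;
}
```

    An address with bit 51 set yields a 36-bit PFN under these
    assumptions, so the sketch rejects it rather than programming a
    truncated value.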
     
  • The OOM notifier is being deprecated for the following reasons:
    - As a callout from the oom context, it is too subtle and easy to
    generate bugs and corner cases which are hard to track;
    - It is called too late (after the reclaiming has been performed).
    Drivers with a large amount of reclaimable memory are expected to
    release it at an early stage of memory pressure;
    - The notifier callback isn't aware of oom constraints.
    Link: https://lkml.org/lkml/2018/7/12/314

    This patch replaces the virtio-balloon oom notifier with a shrinker
    to release balloon pages on memory pressure. The balloon pages are
    given back to mm adaptively by returning the number of pages that the
    reclaimer is asking for (i.e. sc->nr_to_scan).

    Currently the max possible value of sc->nr_to_scan passed to the balloon
    shrinker is SHRINK_BATCH, which is 128. This is smaller than the
    limitation that only VIRTIO_BALLOON_ARRAY_PFNS_MAX (256) pages can be
    returned via one invocation of leak_balloon. But this patch still
    considers the case that SHRINK_BATCH or shrinker->batch could be changed
    to a value larger than VIRTIO_BALLOON_ARRAY_PFNS_MAX, which will need to
    do multiple invocations of leak_balloon.

    Historically, the feature VIRTIO_BALLOON_F_DEFLATE_ON_OOM has been used
    to release balloon pages on OOM. We continue to use this feature bit for
    the shrinker, so the shrinker is only registered when this feature bit
    has been negotiated with host.

    Signed-off-by: Wei Wang
    Cc: Michael S. Tsirkin
    Cc: Michal Hocko
    Cc: Andrew Morton
    Cc: Tetsuo Handa
    Signed-off-by: Michael S. Tsirkin

    Wei Wang
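
    The batching behaviour described in this commit can be sketched as
    a loop around a capped leak function. The struct and the demo_
    functions are hypothetical; only the per-call cap of 256 pages is
    taken from the commit message.

```c
#include <assert.h>

#define DEMO_ARRAY_PFNS_MAX 256    /* per-call cap named in the commit */

struct demo_balloon {
    unsigned long num_pages;       /* pages currently held by the balloon */
};

/* Release up to num pages, capped per call; returns pages released. */
static unsigned long demo_leak_balloon(struct demo_balloon *vb,
                                       unsigned long num)
{
    if (num > DEMO_ARRAY_PFNS_MAX)
        num = DEMO_ARRAY_PFNS_MAX;
    if (num > vb->num_pages)
        num = vb->num_pages;
    vb->num_pages -= num;
    return num;
}

/* Shrinker scan: call the capped leak repeatedly until the request is met. */
static unsigned long demo_shrink_scan(struct demo_balloon *vb,
                                      unsigned long nr_to_scan)
{
    unsigned long freed = 0;

    while (freed < nr_to_scan && vb->num_pages > 0)
        freed += demo_leak_balloon(vb, nr_to_scan - freed);

    return freed;
}
```

    A request for 600 pages therefore takes three leak calls in this
    sketch (256 + 256 + 88), while a request for 128 takes one.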
     
  • Zero all the vb fields at allocation, so that we don't need to
    zero-initialize each field one by one later.

    Signed-off-by: Wei Wang
    Cc: Michael S. Tsirkin
    Cc: Tetsuo Handa
    Signed-off-by: Michael S. Tsirkin

    Wei Wang
     
  • It's a bit overkill to use BUG when failing to add an entry to the
    stats_vq in init_vqs. So remove it and just return the error to the
    caller to bail out nicely.

    Signed-off-by: Wei Wang
    Cc: Michael S. Tsirkin
    Signed-off-by: Michael S. Tsirkin

    Wei Wang
     

12 Aug, 2018

1 commit

  • Make vp_set_vq_affinity() take a cpumask instead of taking a single CPU.

    If there are fewer queues than cores, queue affinity should be able to
    map to multiple cores.

    Link: https://patchwork.ozlabs.org/patch/948149/
    Suggested-by: Willem de Bruijn
    Signed-off-by: Caleb Raitto
    Acked-by: Gonglei
    Signed-off-by: David S. Miller

    Caleb Raitto
     

30 Jul, 2018

1 commit

  • The kernel panics under high memory pressure; the call trace looks like:

    PID: 21439 TASK: ffff881be3afedd0 CPU: 16 COMMAND: "java"
    #0 [ffff881ec7ed7630] machine_kexec at ffffffff81059beb
    #1 [ffff881ec7ed7690] __crash_kexec at ffffffff81105942
    #2 [ffff881ec7ed7760] crash_kexec at ffffffff81105a30
    #3 [ffff881ec7ed7778] oops_end at ffffffff816902c8
    #4 [ffff881ec7ed77a0] no_context at ffffffff8167ff46
    #5 [ffff881ec7ed77f0] __bad_area_nosemaphore at ffffffff8167ffdc
    #6 [ffff881ec7ed7838] __node_set at ffffffff81680300
    #7 [ffff881ec7ed7860] __do_page_fault at ffffffff8169320f
    #8 [ffff881ec7ed78c0] do_page_fault at ffffffff816932b5
    #9 [ffff881ec7ed78f0] page_fault at ffffffff8168f4c8
    [exception RIP: _raw_spin_lock_irqsave+47]
    RIP: ffffffff8168edef RSP: ffff881ec7ed79a8 RFLAGS: 00010046
    RAX: 0000000000000246 RBX: ffffea0019740d00 RCX: ffff881ec7ed7fd8
    RDX: 0000000000020000 RSI: 0000000000000016 RDI: 0000000000000008
    RBP: ffff881ec7ed79a8 R8: 0000000000000246 R9: 000000000001a098
    R10: ffff88107ffda000 R11: 0000000000000000 R12: 0000000000000000
    R13: 0000000000000008 R14: ffff881ec7ed7a80 R15: ffff881be3afedd0
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018

    It happens in the page fault handler and results in a double page fault
    during page compaction when memory allocation fails.

    Analysis of the vmcore shows that the page leading to the second page
    fault is corrupted, with _mapcount=-256 but private=0.

    It's caused by a race between migration and ballooning, due to missing
    locking in virtballoon_migratepage() of the virtio_balloon driver.
    This patch fixes the bug.

    Fixes: e22504296d4f64f ("virtio_balloon: introduce migration primitives to balloon pages")
    Cc: stable@vger.kernel.org
    Signed-off-by: Jiang Biao
    Signed-off-by: Huang Chong
    Signed-off-by: Michael S. Tsirkin

    Jiang Biao
     

16 Jun, 2018

1 commit

  • Pull virtio updates from Michael Tsirkin:
    "virtio, vhost: features, fixes

    - PCI virtual function support for virtio

    - DMA barriers for virtio strong barriers

    - bugfixes"

    * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
    virtio: update the comments for transport features
    virtio_pci: support enabling VFs
    vhost: fix info leak due to uninitialized memory
    virtio_ring: switch to dma_XX barriers for rpmsg

    Linus Torvalds
     

13 Jun, 2018

2 commits

  • The kzalloc() function has a 2-factor argument form, kcalloc(). This
    patch replaces cases of:

    kzalloc(a * b, gfp)

    with:
    kcalloc(a, b, gfp)

    as well as handling cases of:

    kzalloc(a * b * c, gfp)

    with:

    kzalloc(array3_size(a, b, c), gfp)

    as it's slightly less ugly than:

    kcalloc(array_size(a, b), c, gfp)

    This does, however, attempt to ignore constant size factors like:

    kzalloc(4 * 1024, gfp)

    though any constants defined via macros get caught up in the conversion.

    Any factors with a sizeof() of "unsigned char", "char", and "u8" were
    dropped, since they're redundant.
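
    The motivation for the 2-factor form is that the allocator can
    check the multiplication for overflow itself. A userspace sketch of
    that idea, with demo_kcalloc() as a hypothetical analogue built on
    calloc() rather than the kernel function:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Zeroing 2-factor allocator: returns NULL if n * size would overflow,
 * instead of allocating a too-small buffer from the wrapped product.
 */
static void *demo_kcalloc(size_t n, size_t size)
{
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;    /* n * size does not fit in size_t */

    return calloc(n, size);
}
```

    An open-coded malloc(n * size) would wrap around for large n and
    hand back a short buffer; the explicit check above fails cleanly.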

    The Coccinelle script used for this was:

    // Fix redundant parens around sizeof().
    @@
    type TYPE;
    expression THING, E;
    @@

    (
    kzalloc(
    - (sizeof(TYPE)) * E
    + sizeof(TYPE) * E
    , ...)
    |
    kzalloc(
    - (sizeof(THING)) * E
    + sizeof(THING) * E
    , ...)
    )

    // Drop single-byte sizes and redundant parens.
    @@
    expression COUNT;
    typedef u8;
    typedef __u8;
    @@

    (
    kzalloc(
    - sizeof(u8) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(__u8) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(char) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(unsigned char) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(u8) * COUNT
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(__u8) * COUNT
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(char) * COUNT
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(unsigned char) * COUNT
    + COUNT
    , ...)
    )

    // 2-factor product with sizeof(type/expression) and identifier or constant.
    @@
    type TYPE;
    expression THING;
    identifier COUNT_ID;
    constant COUNT_CONST;
    @@

    (
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * (COUNT_ID)
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * COUNT_ID
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * (COUNT_CONST)
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * COUNT_CONST
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * (COUNT_ID)
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * COUNT_ID
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * (COUNT_CONST)
    + COUNT_CONST, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * COUNT_CONST
    + COUNT_CONST, sizeof(THING)
    , ...)
    )

    // 2-factor product, only identifiers.
    @@
    identifier SIZE, COUNT;
    @@

    - kzalloc
    + kcalloc
    (
    - SIZE * COUNT
    + COUNT, SIZE
    , ...)

    // 3-factor product with 1 sizeof(type) or sizeof(expression), with
    // redundant parens removed.
    @@
    expression THING;
    identifier STRIDE, COUNT;
    type TYPE;
    @@

    (
    kzalloc(
    - sizeof(TYPE) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    )

    // 3-factor product with 2 sizeof(variable), with redundant parens removed.
    @@
    expression THING1, THING2;
    identifier COUNT;
    type TYPE1, TYPE2;
    @@

    (
    kzalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kzalloc(
    - sizeof(THING1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kzalloc(
    - sizeof(THING1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    )

    // 3-factor product, only identifiers, with redundant parens removed.
    @@
    identifier STRIDE, SIZE, COUNT;
    @@

    (
    kzalloc(
    - (COUNT) * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - (COUNT) * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - (COUNT) * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - (COUNT) * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    )

    // Any remaining multi-factor products, first at least 3-factor products,
    // when they're not all constants...
    @@
    expression E1, E2, E3;
    constant C1, C2, C3;
    @@

    (
    kzalloc(C1 * C2 * C3, ...)
    |
    kzalloc(
    - (E1) * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc(
    - (E1) * (E2) * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc(
    - (E1) * (E2) * (E3)
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc(
    - E1 * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    )

    // And then all remaining 2 factors products when they're not all constants,
    // keeping sizeof() as the second factor argument.
    @@
    expression THING, E1, E2;
    type TYPE;
    constant C1, C2, C3;
    @@

    (
    kzalloc(sizeof(THING) * C2, ...)
    |
    kzalloc(sizeof(TYPE) * C2, ...)
    |
    kzalloc(C1 * C2 * C3, ...)
    |
    kzalloc(C1 * C2, ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * (E2)
    + E2, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * E2
    + E2, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * (E2)
    + E2, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * E2
    + E2, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - (E1) * E2
    + E1, E2
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - (E1) * (E2)
    + E1, E2
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - E1 * E2
    + E1, E2
    , ...)
    )

    Signed-off-by: Kees Cook

    Kees Cook
     
  • The kmalloc() function has a 2-factor argument form, kmalloc_array(). This
    patch replaces cases of:

    kmalloc(a * b, gfp)

    with:
    kmalloc_array(a, b, gfp)

    as well as handling cases of:

    kmalloc(a * b * c, gfp)

    with:

    kmalloc(array3_size(a, b, c), gfp)

    as it's slightly less ugly than:

    kmalloc_array(array_size(a, b), c, gfp)

    This does, however, attempt to ignore constant size factors like:

    kmalloc(4 * 1024, gfp)

    though any constants defined via macros get caught up in the conversion.

    Any factors with a sizeof() of "unsigned char", "char", and "u8" were
    dropped, since they're redundant.

    The tools/ directory was manually excluded, since it has its own
    implementation of kmalloc().
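
    For the 3-factor case, the size helper saturates instead of
    wrapping, so the subsequent allocation fails rather than coming up
    short. A userspace sketch of that behaviour, with
    demo_array3_size() as a hypothetical analogue of the kernel helper:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Multiply three factors; on overflow return SIZE_MAX, a size that no
 * allocator can satisfy, so the caller's allocation fails cleanly.
 */
static size_t demo_array3_size(size_t a, size_t b, size_t c)
{
    size_t ab;

    if (b != 0 && a > SIZE_MAX / b)
        return SIZE_MAX;
    ab = a * b;

    if (c != 0 && ab > SIZE_MAX / c)
        return SIZE_MAX;
    return ab * c;
}
```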

    The Coccinelle script used for this was:

    // Fix redundant parens around sizeof().
    @@
    type TYPE;
    expression THING, E;
    @@

    (
    kmalloc(
    - (sizeof(TYPE)) * E
    + sizeof(TYPE) * E
    , ...)
    |
    kmalloc(
    - (sizeof(THING)) * E
    + sizeof(THING) * E
    , ...)
    )

    // Drop single-byte sizes and redundant parens.
    @@
    expression COUNT;
    typedef u8;
    typedef __u8;
    @@

    (
    kmalloc(
    - sizeof(u8) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(__u8) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(char) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(unsigned char) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(u8) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(__u8) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(char) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(unsigned char) * COUNT
    + COUNT
    , ...)
    )

    // 2-factor product with sizeof(type/expression) and identifier or constant.
    @@
    type TYPE;
    expression THING;
    identifier COUNT_ID;
    constant COUNT_CONST;
    @@

    (
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (COUNT_ID)
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * COUNT_ID
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (COUNT_CONST)
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * COUNT_CONST
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (COUNT_ID)
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * COUNT_ID
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (COUNT_CONST)
    + COUNT_CONST, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * COUNT_CONST
    + COUNT_CONST, sizeof(THING)
    , ...)
    )

    // 2-factor product, only identifiers.
    @@
    identifier SIZE, COUNT;
    @@

    - kmalloc
    + kmalloc_array
    (
    - SIZE * COUNT
    + COUNT, SIZE
    , ...)

    // 3-factor product with 1 sizeof(type) or sizeof(expression), with
    // redundant parens removed.
    @@
    expression THING;
    identifier STRIDE, COUNT;
    type TYPE;
    @@

    (
    kmalloc(
    - sizeof(TYPE) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    )

    // 3-factor product with 2 sizeof(variable), with redundant parens removed.
    @@
    expression THING1, THING2;
    identifier COUNT;
    type TYPE1, TYPE2;
    @@

    (
    kmalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kmalloc(
    - sizeof(THING1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(THING1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    )

    // 3-factor product, only identifiers, with redundant parens removed.
    @@
    identifier STRIDE, SIZE, COUNT;
    @@

    (
    kmalloc(
    - (COUNT) * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    )

    // Any remaining multi-factor products, first at least 3-factor products,
    // when they're not all constants...
    @@
    expression E1, E2, E3;
    constant C1, C2, C3;
    @@

    (
    kmalloc(C1 * C2 * C3, ...)
    |
    kmalloc(
    - (E1) * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - (E1) * (E2) * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - (E1) * (E2) * (E3)
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - E1 * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    )

    // And then all remaining 2 factors products when they're not all constants,
    // keeping sizeof() as the second factor argument.
    @@
    expression THING, E1, E2;
    type TYPE;
    constant C1, C2, C3;
    @@

    (
    kmalloc(sizeof(THING) * C2, ...)
    |
    kmalloc(sizeof(TYPE) * C2, ...)
    |
    kmalloc(C1 * C2 * C3, ...)
    |
    kmalloc(C1 * C2, ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (E2)
    + E2, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * E2
    + E2, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (E2)
    + E2, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * E2
    + E2, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - (E1) * E2
    + E1, E2
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - (E1) * (E2)
    + E1, E2
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - E1 * E2
    + E1, E2
    , ...)
    )

    Signed-off-by: Kees Cook

    Kees Cook
     

12 Jun, 2018

1 commit

  • There is a new feature bit allocated in virtio spec to
    support SR-IOV (Single Root I/O Virtualization):

    https://github.com/oasis-tcs/virtio-spec/issues/11

    This patch enables the support for this feature bit in
    virtio driver.

    Signed-off-by: Tiwei Bie
    Signed-off-by: Michael S. Tsirkin

    Tiwei Bie