02 Jun, 2011

12 commits


01 Jun, 2011

5 commits


30 May, 2011

14 commits

  • Ask for delayed callbacks on TX ring full, to give the
    other side more of a chance to make progress.

    Signed-off-by: Michael S. Tsirkin
    Acked-by: David S. Miller
    Signed-off-by: Rusty Russell

    Michael S. Tsirkin
     
  • Add an API that tells the other side that callbacks
    should be delayed until a lot of work has been done.
    Implement using the new event_idx feature.

    Note: it might seem advantageous to let the drivers
    ask for a callback after a specific capacity has
    been reached. However, as a single head can
    free many entries in the descriptor table,
    we don't really have a clue about capacity
    until get_buf is called. The API is the simplest
    to implement at the moment; we'll see what kind of
    hints drivers can pass when there's more than one
    user of the feature.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell

    Michael S. Tsirkin
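
One way to picture the delayed-callback request with event_idx: the driver publishes a used-event index some distance ahead of what it has already consumed, so the other side signals only once that much work is done. This is a rough sketch, not the patch itself. The function names and the three-quarters fraction are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick the used index at which a callback is wanted.
 * avail_idx     - next slot the driver will fill (free-running 16-bit index)
 * last_used_idx - next used entry the driver has not yet consumed
 * The 3/4 fraction is an assumption for this sketch.
 */
static uint16_t delayed_used_event(uint16_t avail_idx, uint16_t last_used_idx)
{
    /* Buffers handed to the other side and not yet consumed back. */
    uint16_t outstanding = (uint16_t)(avail_idx - last_used_idx);
    uint16_t delay = (uint16_t)(outstanding * 3 / 4);

    return (uint16_t)(last_used_idx + delay);
}

static void delayed_used_event_demo(void)
{
    /* 80 buffers outstanding: ask for a callback after 60 of them. */
    assert(delayed_used_event(100, 20) == 80);
    /* Index wrap-around: 16 outstanding across the 65535 -> 0 boundary. */
    assert(delayed_used_event(10, 65530) == 6);
}
```

The value is expressed as an index rather than a capacity, which matches the note above: the number of descriptors freed per head is not known until get_buf is called.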
     
  • Support the new event index feature. When acked,
    utilize it to reduce the number of interrupts sent to the guest.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell

    Michael S. Tsirkin
     
  • Support for the new event idx feature:
    1. When enabling interrupts, publish the current avail index
       value to the host to get interrupts on the next update.
    2. Use the new avail_event feature to reduce the number
       of exits from the guest.

    Simple test with the simulator:

    [virtio]# time ./virtio_test
    spurious wakeus: 0x7

    real 0m0.169s
    user 0m0.140s
    sys 0m0.019s
    [virtio]# time ./virtio_test --no-event-idx
    spurious wakeus: 0x11

    real 0m0.649s
    user 0m0.295s
    sys 0m0.335s

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell

    Michael S. Tsirkin
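
The core of the event index scheme is a single wrap-safe comparison: after moving an index from old to new, a side notifies the other only if the published event index was crossed by that move. The sketch below follows the check described in the virtio specification. The names need_event and need_event_demo are illustrative, and nothing here is copied from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns 1 if advancing an index from old_idx to new_idx stepped past
 * event_idx, meaning the other side asked to be notified at this point.
 * All arithmetic is modulo 2^16, so index wrap-around is handled.
 */
static int need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}

static void need_event_demo(void)
{
    /* Other side asked for an event once index 5 is passed; we moved 5 -> 6. */
    assert(need_event(5, 6, 5) == 1);
    /* Event index 10 is still ahead of us: stay quiet. */
    assert(need_event(10, 6, 5) == 0);
    /* Wrap-around: moving 65535 -> 1 passes event index 0. */
    assert(need_event(0, 1, 65535) == 1);
}
```

With the feature disabled, every update may trigger a notification, which is consistent with the higher system time in the --no-event-idx run above.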
     
  • The virtio balloon driver has a VIRTIO_BALLOON_F_MUST_TELL_HOST
    feature bit. Whenever the bit is set, the guest kernel must
    always tell the host before we free pages back to the allocator.
    Without this feature, we might free a page (and have another
    user touch it) while the hypervisor is unprepared for it.

    But, if the bit is _not_ set, we are under no obligation to
    reverse the order; we're under no obligation to do _anything_.
    As of now, qemu-kvm defines the bit, but doesn't set it.

    This patch makes the "tell host first" logic the only case. This
    should make everybody happy, and reduce the amount of untested or
    untestable code in the kernel.

    This _also_ means that we don't have to preserve a pfn list
    after the pages are freed, which should let us get rid of some
    temporary storage (vb->pfns) eventually.

    Signed-off-by: Dave Hansen
    Signed-off-by: Rusty Russell

    Dave Hansen
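
The ordering this commit makes unconditional can be shown with a small user-space model. Everything here is hypothetical (the trace structure and the tell_host, free_pages and deflate names are stand-ins, not the driver's functions). The only point is that the host is informed before pages go back to the allocator.

```c
#include <assert.h>
#include <stddef.h>

enum step { STEP_TELL_HOST, STEP_FREE_PAGES };

/* Records the order in which the two steps happen. */
struct trace {
    enum step steps[4];
    size_t len;
};

static void record(struct trace *t, enum step s)
{
    assert(t->len < sizeof(t->steps) / sizeof(t->steps[0]));
    t->steps[t->len++] = s;
}

/* Stand-in for notifying the hypervisor about pages leaving the balloon. */
static void tell_host(struct trace *t)
{
    record(t, STEP_TELL_HOST);
}

/* Stand-in for returning the pages to the guest allocator. */
static void free_pages(struct trace *t)
{
    record(t, STEP_FREE_PAGES);
}

/* The single remaining case: always tell the host first, then free. */
static void deflate(struct trace *t)
{
    tell_host(t);
    free_pages(t);
}
```

Because the host is told before anything is freed, nothing in this model needs to remember the page list after the free step.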
     
  • That's already been done by the virtio infrastructure before the probe
    function is called.

    Reported-by: alexey.kardashevskiy@au1.ibm.com
    Acked-by: Amit Shah
    Tested-by: Amit Shah
    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • It is easier to figure out the context by reading SCSI_SENSE_BUFFERSIZE
    instead of plain '96'.

    Signed-off-by: Liu Yuan
    Signed-off-by: Rusty Russell

    Liu Yuan
     
  • Wire up the virtio_driver config_changed method to get notified about
    config changes raised by the host. For now we just re-read the device
    size to support online resizing of devices, but once we add more
    attributes that might be changeable they could be added as well.

    Note that the config_changed method is called from irq context, so
    we'll have to use the workqueue infrastructure to provide us a proper
    user context for our changes.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Rusty Russell

    Christoph Hellwig
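
The split between the irq-context notification and the deferred work can be modelled in plain C. This is a user-space sketch under stated assumptions, not kernel code: the struct disk fields and the config_work name are invented for the example, and a simple flag stands in for the workqueue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct disk {
    uint64_t device_capacity; /* size currently reported by the simulated host */
    uint64_t capacity;        /* size the driver has published to its users */
    int work_pending;         /* set in "irq context", cleared by the worker */
};

/* Models the config_changed hook: it must not block, so it only queues work. */
static void config_changed(struct disk *d)
{
    assert(d != NULL);
    d->work_pending = 1;
}

/* Models the work item: runs later, in a context that is allowed to sleep. */
static void config_work(struct disk *d)
{
    assert(d != NULL);
    if (!d->work_pending)
        return;
    d->work_pending = 0;
    d->capacity = d->device_capacity; /* re-read the device size */
}
```

The published capacity changes only when the worker runs, never inside the notification itself.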
     
  • We had a few drivers move from arch/arm into drivers/gpio, but they
    don't actually compile without the ARM platform headers etc. As a
    result they were messing up allyesconfig on x86.

    Make them depend on ARM.

    Reported-by: Ingo Molnar
    Cc: Andrew Morton
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86: (43 commits)
    acer-wmi: support integer return type from WMI methods
    msi-laptop: fix section mismatch in reference from the function load_scm_model_init
    acer-wmi: support to set communication device state by new wmid method
    acer-wmi: allow 64-bits return buffer from WMI methods
    acer-wmi: check the existence of internal 3G device when set capability
    platform/x86:delete two unused variables
    support wlan hotkey on Acer Travelmate 5735Z
    platform-x86: intel_mid_thermal: Fix memory leak
    platform/x86: Fix Makefile for intel_mid_powerbtn
    platform/x86: Simplify intel_mid_powerbtn
    acer-wmi: Delete out-of-date documentation
    acerhdf: Clean up includes
    acerhdf: Drop pointless dependency on THERMAL_HWMON
    acer-wmi: Update MAINTAINERS
    wmi: Orphan ACPI-WMI driver
    tc1100-wmi: Orphan driver
    acer-wmi: does not allow negative number set to initial device state
    platform/oaktrail: ACPI EC Extra driver for Oaktrail
    thinkpad_acpi: Convert printks to pr_
    thinkpad_acpi: Correct !CONFIG_THINKPAD_ACPI_VIDEO warning
    ...

    Linus Torvalds
     
  • Matthew Garrett
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm:
    dm kcopyd: return client directly and not through a pointer
    dm kcopyd: reserve fewer pages
    dm io: use fixed initial mempool size
    dm kcopyd: alloc pages from the main page allocator
    dm kcopyd: add gfp parm to alloc_pl
    dm kcopyd: remove superfluous page allocation spinlock
    dm kcopyd: preallocate sub jobs to avoid deadlock
    dm kcopyd: avoid pointless job splitting
    dm mpath: do not fail paths after integrity errors
    dm table: reject devices without request fns
    dm table: allow targets to support discards internally

    Linus Torvalds
     
  • * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
    ACPI EC: remove redundant code
    ACPI: Add D3 cold state
    ACPI: processor: fix processor_physically_present in UP kernel
    ACPI: Split out custom_method functionality into an own driver
    ACPI: Cleanup custom_method debug stuff
    ACPI EC: enable MSI workaround for Quanta laptops
    ACPICA: Update to version 20110413
    ACPICA: Execute an orphan _REG method under the EC device
    ACPICA: Move ACPI_NUM_PREDEFINED_REGIONS to a more appropriate place
    ACPICA: Update internal address SpaceID for DataTable regions
    ACPICA: Add more methods eligible for NULL package element removal
    ACPICA: Split all internal Global Lock functions to new file - evglock
    ACPI: EC: add another DMI check for ASUS hardware
    ACPI EC: remove dead code
    ACPICA: Fix code divergence of global lock handling
    ACPICA: Use acpi_os_create_lock interface
    ACPI: osl, add acpi_os_create_lock interface
    ACPI:Fix goto flows in thermal-sys

    Linus Torvalds
     
  • * 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6:
    x86 idle: deprecate mwait_idle() and "idle=mwait" cmdline param
    x86 idle: deprecate "no-hlt" cmdline param
    x86 idle APM: deprecate CONFIG_APM_CPU_IDLE
    x86 idle floppy: deprecate disable_hlt()
    x86 idle: EXPORT_SYMBOL(default_idle, pm_idle) only when APM demands it
    x86 idle: clarify AMD erratum 400 workaround
    idle governor: Avoid lock acquisition to read pm_qos before entering idle
    cpuidle: menu: fixed wrapping timers at 4.294 seconds

    Linus Torvalds
     

29 May, 2011

9 commits

  • Return client directly from dm_kcopyd_client_create, not through a
    parameter, making it consistent with dm_io_client_create.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • Reserve just the minimum of pages needed to process one job.

    Because we allocate pages from the page allocator, we don't need to reserve
    a large number of pages. The maximum job size is SUB_JOB_SIZE and we
    calculate the number of reserved pages based on this.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
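
A worked version of the reserve calculation, as a sketch: the reserve only has to cover one job of SUB_JOB_SIZE sectors, rounded up to whole pages. The value 128 for SUB_JOB_SIZE, the 512-byte sector size and the reserve_pages name are assumptions made for this example.

```c
#include <assert.h>

#define SECTOR_SHIFT 9   /* assumed 512-byte sectors */
#define SUB_JOB_SIZE 128 /* assumed maximum job size, in sectors */

/* Pages needed to hold one job of job_sectors sectors, rounded up. */
static unsigned long reserve_pages(unsigned long job_sectors,
                                   unsigned long page_size)
{
    unsigned long bytes;

    assert(page_size > 0);
    bytes = job_sectors << SECTOR_SHIFT;
    return (bytes + page_size - 1) / page_size;
}
```

Under these assumptions a 64 KiB job needs 16 pages of 4 KiB, or a single 64 KiB page.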
     
  • Replace the arbitrary calculation of an initial io struct mempool size
    with a constant.

    The code calculated the number of reserved structures based on the request
    size and used a "magic" multiplication constant of 4. This patch changes
    it to reserve a fixed number - itself still chosen quite arbitrarily.
    Further testing might show if there is a better number to choose.

    Note that if there is no memory pressure, we can still allocate an
    arbitrary number of "struct io" structures. One structure is enough to
    process the whole request.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • This patch changes dm-kcopyd so that it allocates pages from the main
    page allocator with __GFP_NOWARN | __GFP_NORETRY flags (so that it can
    fail in case of memory pressure). If the allocation fails, dm-kcopyd
    allocates pages from its own reserve.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
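
The allocate-then-fall-back pattern described here can be sketched in user space. All names below are hypothetical: try_alloc_page stands in for an allocation that is allowed to fail, and the memory_pressure flag simulates that failure. The real driver's data structures are not reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_BYTES 4096

/* A small private reserve of preallocated pages. */
struct reserve {
    void *pages[4];
    size_t count;
};

/* Stand-in for an opportunistic allocation that may fail under pressure. */
static void *try_alloc_page(int memory_pressure)
{
    return memory_pressure ? NULL : malloc(PAGE_BYTES);
}

/*
 * Try the main allocator first; on failure take a page from the reserve.
 * *from_reserve reports which source was used. Returns NULL only if the
 * allocator failed and the reserve is empty.
 */
static void *get_page(struct reserve *r, int memory_pressure, int *from_reserve)
{
    void *page;

    assert(r != NULL && from_reserve != NULL);
    *from_reserve = 0;
    page = try_alloc_page(memory_pressure);
    if (page)
        return page;
    if (r->count == 0)
        return NULL;
    *from_reserve = 1;
    return r->pages[--r->count];
}
```

A caller has to remember where each page came from so that it is returned to the right place when the job completes.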
     
  • Introduce a parameter for gfp flags to alloc_pl() for use in following
    patches.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • Remove the spinlock protecting page allocation. The spinlock is only
    taken on initialization or from a single-threaded workqueue. Therefore, the
    spinlock is useless.

    The spinlock is taken in kcopyd_get_pages and kcopyd_put_pages.

    kcopyd_get_pages is only called from run_pages_job, which is only
    called from process_jobs called from do_work.

    kcopyd_put_pages is called from client_alloc_pages (which is an
    initialization function) or from run_complete_job. run_complete_job is
    only called from process_jobs called from do_work.

    Another spinlock, kc->job_lock is taken each time someone pushes or pops
    some work for the worker thread. Once we take kc->job_lock, we
    guarantee that any written memory is visible to the other CPUs.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • There's a theoretical deadlock in dm-kcopyd because multiple
    allocations from the same mempool are required to finish a request.
    Avoid this by preallocating sub jobs.

    There is a mempool of 512 entries. Each request requires up to 9
    entries from the mempool. If we have at least 57 concurrent requests
    running, the mempool may be exhausted and mempool allocations may start
    blocking until another entry is freed to the mempool. Because the same
    thread is used to free entries to the mempool and allocate entries from
    the mempool, this may result in a deadlock.

    This patch changes it so that one mempool entry contains all 9 "struct
    kcopyd_job" required to fulfill the whole request. The allocation is
    done only once in dm_kcopyd_copy and no further mempool allocations are
    done during request processing.

    If dm_kcopyd_copy is not run in the completion thread, this
    implementation is deadlock-free.

    MIN_JOBS needs reducing accordingly and we've chosen to reduce it
    further to 8.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
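
The figure of 57 follows from the numbers in the message: 56 requests need at most 56 × 9 = 504 entries, which fits in 512, while 57 requests can need 513. A small sketch of that arithmetic, with a hypothetical function name:

```c
#include <assert.h>

/*
 * Smallest number of concurrent requests whose worst-case demand exceeds
 * the pool, given that each request may take entries_per_request entries.
 */
static unsigned int requests_to_exhaust(unsigned int pool_entries,
                                        unsigned int entries_per_request)
{
    assert(entries_per_request > 0);
    return pool_entries / entries_per_request + 1;
}
```

Once each request takes exactly one entry, allocated up front, a request no longer needs a second allocation from the same pool while it is being processed. That second allocation is the step that could block in the old scheme.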
     
  • Don't split SUB_JOB_SIZE jobs

    If the job size equals SUB_JOB_SIZE, there is no point in splitting it.
    Splitting it just unnecessarily wastes time, because the split job size
    is SUB_JOB_SIZE too.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
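
The boundary case reads as a one-line predicate. This is a sketch with assumed names and an assumed SUB_JOB_SIZE of 128 sectors: a job is split only when it is strictly larger than one sub job.

```c
#include <assert.h>

#define SUB_JOB_SIZE 128 /* assumed sub job size, in sectors */

/* Returns 1 if a job of count sectors should be split into sub jobs. */
static int needs_split(unsigned long count)
{
    assert(count > 0); /* empty jobs are outside this sketch */
    return count > SUB_JOB_SIZE;
}
```

A job of exactly SUB_JOB_SIZE sectors is dispatched as is, since splitting it would produce one piece of the same size.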
     
  • Integrity errors need to be passed to the owner of the integrity
    metadata for processing. Consequently EILSEQ should be passed up the
    stack.

    Cc: stable@kernel.org
    Signed-off-by: Martin K. Petersen
    Acked-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Martin K. Petersen
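
A minimal sketch of the decision this commit describes, assuming kernel-style negative errno values. The enum and the classify_io_error name are hypothetical, and treating every other error as a path failure is a simplification for the example.

```c
#include <assert.h>
#include <errno.h>

enum io_action {
    IO_DONE,     /* request completed successfully */
    IO_PASS_UP,  /* hand the error to the upper layer, keep the path */
    IO_FAIL_PATH /* treat the error as a path failure */
};

/* error is 0 on success or a negative errno value. */
static enum io_action classify_io_error(int error)
{
    assert(error <= 0);
    if (error == 0)
        return IO_DONE;
    if (error == -EILSEQ)
        return IO_PASS_UP; /* integrity error: not the path's fault */
    return IO_FAIL_PATH;
}
```

An integrity error says something about the data or its metadata rather than about the route to the device, so retrying on another path would not help.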