20 Jun, 2008

1 commit

  • I am able to reproduce the oops reported by Simon in __switch_to() with
    lguest.

    My debugging showed that there is at least one lguest-specific
    issue (which should also be present in 2.6.25 and earlier), and it was
    exposed as a kernel oops by the recent FPU dynamic allocation patches.

    In addition to the previous possible scenario (with fpu_counter), in the
    presence of lguest, it is possible that the cpu's TS bit is still set and the
    lguest launcher task's thread_info has TS_USEDFPU still set.

    This is because of the way the lguest launcher handles the guest's TS bit
    (look at lguest_set_ts() in lguest_arch_run_guest()). This can result
    in a DNA fault while doing unlazy_fpu() in __switch_to(). This will
    end up causing a DNA fault in the context of the new process that's
    getting context switched in (as opposed to handling the DNA fault in the
    context of the lguest launcher/helper process).

    This is wrong in both pre and post 2.6.25 kernels. In the recent
    2.6.26-rc series, this is showing up as NULL pointer dereferences or
    "sleeping function called from atomic context" (in __switch_to()), as
    we free and dynamically allocate the FPU context for the newly
    created threads. Older kernels might show some FPU corruption for processes
    running inside of lguest.

    With the appended patch, my test system has been running for more than
    50 minutes now, so at least some of your oopses (hopefully all!) should
    get fixed.
    Please give it a try. I will spend more time with this fix tomorrow.

    Reported-by: Simon Holm Thøgersen
    Reported-by: Patrick McHardy
    Signed-off-by: Suresh Siddha
    Signed-off-by: Ingo Molnar

    Suresh Siddha
     

30 May, 2008

2 commits

  • Anthony Liguori points out that three different transports use the virtio code,
    but each one keeps its own counter to set the virtio_device's index field. In
    theory (though not in current practice) this means that names could be
    duplicated, and that risk grows as more transports are created.

    So we move the selection of the unique virtio_device.index into the common code
    in virtio.c, which has the side-benefit of removing duplicate code.

    The only complexity is that lguest and S/390 use the index to uniquely identify
    the device in case of catastrophic failure before register_virtio_device() is
    called: now we use the offset within the descriptor page as a unique identifier
    for the printks.
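
    A minimal sketch of the idea, not the actual virtio.c change: one counter
    owned by the common code hands out the index, so transports cannot collide.
    The names demo_virtio_device, demo_dev_index and demo_register_device() are
    hypothetical, and the sketch ignores locking.

```c
/* Hypothetical stand-in for struct virtio_device; only the index matters. */
struct demo_virtio_device {
    unsigned int index;
};

/* Single counter shared by every transport, kept in the common code. */
static unsigned int demo_dev_index;

/* Hypothetical analogue of register_virtio_device(): assign the next index. */
static void demo_register_device(struct demo_virtio_device *dev)
{
    dev->index = demo_dev_index++;
}
```

    Because no transport keeps its own counter, two devices registered through
    different transports still receive distinct indices.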

    Signed-off-by: Rusty Russell
    Cc: Christian Borntraeger
    Cc: Martin Schwidefsky
    Cc: Carsten Otte
    Cc: Heiko Carstens
    Cc: Chris Lalancette
    Cc: Anthony Liguori

    Rusty Russell
     
  • Thanks to Jon Corbet & LWN. Only took me a day to join the dots.

    Host->Guest netcat before (with unnecessarily large receive buffers):
    1073741824 bytes (1.1 GB) copied, 24.7528 seconds, 43.4 MB/s

    After:
    1073741824 bytes (1.1 GB) copied, 17.6369 seconds, 60.9 MB/s

    Signed-off-by: Rusty Russell

    Rusty Russell
     

02 May, 2008

4 commits

  • This brings us closer to Real Life, where we'd examine the device
    features once it's set the DRIVER_OK status bit.

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • If lg isn't NULL, and cpu_id is sane, &lg->cpus[cpu_id] can't be NULL.

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • NR_CPUS (being a host number) is an arbitrary limit for the Guest.
    Using the array size directly (which currently happens to be NR_CPUS)
    is more futureproof.
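
    A sketch of the pattern, with hypothetical names (demo_guest, demo_cpu,
    demo_get_cpu()) and an arbitrary array length: the bound comes from the
    array itself, so it stays correct if the array is ever resized.

```c
#include <stddef.h>

/* Same idea as the kernel's ARRAY_SIZE(), redefined for a standalone demo. */
#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct demo_cpu {
    int id;
};

struct demo_guest {
    struct demo_cpu cpus[4];    /* length chosen arbitrarily for the demo */
};

/* Return the cpu slot, or NULL if cpu_id is outside the array. */
static struct demo_cpu *demo_get_cpu(struct demo_guest *lg, unsigned int cpu_id)
{
    if (cpu_id >= DEMO_ARRAY_SIZE(lg->cpus))
        return NULL;
    return &lg->cpus[cpu_id];
}
```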

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • A recent proposed feature addition to the virtio block driver revealed
    some flaws in the API: in particular, we assume that feature
    negotiation is complete once a driver's probe function returns.

    There is nothing in the API to require this, however, and even I
    didn't notice when it was violated.

    So instead, we require the driver to specify what features it supports
    in a table; we can then move the feature negotiation into the virtio
    core. The intersection of device and driver features is presented in
    a new 'features' bitmap in the struct virtio_device.

    Note that this highlights the difference between Linux unsigned-long
    bitmaps where each unsigned long is in native endian, and a
    straight-forward little-endian array of bytes.

    Drivers can still remove feature bits in their probe routine if they
    really have to.

    API changes:
    - dev->config->feature() no longer gets and acks a feature.
    - drivers should advertise their features in the 'feature_table' field
    - use virtio_has_feature() for extra sanity when checking feature bits
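
    The negotiation described above can be sketched roughly as follows. This is
    an illustration only: it uses a plain 32-bit mask instead of the kernel's
    unsigned-long bitmap, and demo_negotiate() and demo_has_feature() are
    hypothetical names, not the virtio API.

```c
#include <stddef.h>
#include <stdint.h>

/* Keep only the bits that the driver lists AND the device offers. */
static uint32_t demo_negotiate(uint32_t device_features,
                               const unsigned int *feature_table,
                               size_t table_len)
{
    uint32_t negotiated = 0;
    size_t i;

    for (i = 0; i < table_len; i++) {
        unsigned int bit = feature_table[i];

        if (bit < 32 && (device_features & (UINT32_C(1) << bit)))
            negotiated |= UINT32_C(1) << bit;
    }
    return negotiated;
}

/* Range-checked test of a negotiated bit. */
static int demo_has_feature(uint32_t negotiated, unsigned int bit)
{
    return bit < 32 && (negotiated & (UINT32_C(1) << bit)) != 0;
}
```

    A driver whose table lists bits 0 and 5, talking to a device that offers
    bits 0, 1 and 5, ends up with bits 0 and 5 only.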

    Signed-off-by: Rusty Russell

    Rusty Russell
     

19 Apr, 2008

1 commit


31 Mar, 2008

1 commit


28 Mar, 2008

2 commits


11 Mar, 2008

3 commits

  • Ahmed managed to crash the Host in release_pgd(), which cannot be a Guest
    bug, and indeed it wasn't.

    The bug was that handing a 0 as the address of the toplevel page table
    being manipulated can cause the lookup code in find_pgdir() to return
    an uninitialized cache entry (we shadow up to 4 top level page tables
    for each Guest).
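
    To illustrate the failure mode (a sketch with hypothetical names, and an
    assumption about the shape of the fix rather than a copy of it): a cache
    slot that was never filled has guest address 0, so a lookup for address 0
    matches it unless the lookup also checks that the slot is in use.

```c
#include <stddef.h>

#define DEMO_NR_PGDIRS 4    /* up to 4 shadowed top-level page tables */

struct demo_pgdir {
    unsigned long gpgdir;   /* guest address of the top-level page table */
    void *pgdir;            /* shadow copy; NULL while the slot is unused */
};

/* Return the matching slot index, or DEMO_NR_PGDIRS if none matches. */
static unsigned int demo_find_pgdir(const struct demo_pgdir *cache,
                                    unsigned long pgtable)
{
    unsigned int i;

    for (i = 0; i < DEMO_NR_PGDIRS; i++)
        if (cache[i].pgdir && cache[i].gpgdir == pgtable)
            break;
    return i;
}
```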

    Commit 37cc8d7f963ba2deec29c9b68716944516a3244f introduced this
    behaviour in the Guest, uncovering the bug.

    The patch which he submitted (which removed the /4 from the index
    calculation) simply ensured that these high-indexed entries hit the
    early exit path of guest_set_pmd(). But you get lots of segfaults in
    guest userspace as the PMDs aren't being updated.

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • Robert Bragg's 5dc331852848a38ca00a2817e5b98a1d0561b116 tightened
    (ie. fixed) the checking in __get_vm_area, and it broke lguest.

    lguest should pass the exact "end" it wants, not some random constant
    (it was possible previously that it would actually get an address
    different from SWITCHER_ADDR).

    Also, Fabio Checconi pointed out that we should make sure we're not
    hitting the fixmap area.

    Signed-off-by: Rusty Russell
    Cc: Robert Bragg

    Rusty Russell
     
  • If req is LHREQ_INITIALIZE, and the guest has been initialized before
    (unlikely), it will attempt to access cpu->tsk even though cpu is not yet
    initialized.

    Signed-off-by: Eugene Teo
    Signed-off-by: Rusty Russell

    Eugene Teo
     

10 Feb, 2008

1 commit

  • Beginning from commit 4138cc3418f5, ioremap_nocache() sets the _PAGE_PWT
    flag.

    Lguest doesn't accept a guest pte with a _PWT flag and reports a "bad
    page table entry" in that case.

    Accept guest _PAGE_PWT page table entries.
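
    Roughly, the change amounts to dropping one bit from a "forbidden flags"
    mask. The sketch below uses the architectural x86 bit values, but the
    function names are hypothetical and the assumption that PSE is the other
    rejected flag is illustrative, not taken from this patch.

```c
#include <stdint.h>

#define DEMO_PAGE_PWT 0x008u    /* page-level write-through (bit 3) */
#define DEMO_PAGE_PSE 0x080u    /* large page (bit 7) */

/* Before: a guest pte with PWT set was treated as bad. */
static int demo_pte_ok_old(uint32_t flags)
{
    return (flags & (DEMO_PAGE_PWT | DEMO_PAGE_PSE)) == 0;
}

/* After: PWT is accepted; only the remaining forbidden flag is rejected. */
static int demo_pte_ok_new(uint32_t flags)
{
    return (flags & DEMO_PAGE_PSE) == 0;
}
```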

    Signed-off-by: Ahmed S. Darwish
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Ahmed S. Darwish
     

09 Feb, 2008

1 commit

  • Using "attr" twice is not OK, because it effectively prohibits such
    container_of() on variables not named "attr".
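
    A standalone illustration of the problem, with hypothetical type and macro
    names: when the macro parameter shares its name with the struct member, the
    preprocessor substitutes the caller's expression for the member name too.

```c
#include <stddef.h>

/* Simplified container_of() for a userspace demo. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_attribute {
    int mode;
};

struct demo_wrapper {
    int value;
    struct demo_attribute attr;
};

/*
 * Problematic form: the parameter is named "attr", so the third argument
 * (meant to be the member name) is replaced by the caller's expression.
 * It only compiles when the caller's variable is literally named "attr":
 *
 *   #define to_demo_wrapper(attr) demo_container_of(attr, struct demo_wrapper, attr)
 */

/* Fixed form: a distinct parameter name leaves the member name alone. */
#define to_demo_wrapper(_attr) \
    demo_container_of(_attr, struct demo_wrapper, attr)
```

    With the fixed form, a pointer named p (or anything else) converts back to
    its enclosing structure as expected.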

    Signed-off-by: Alexey Dobriyan
    Acked-by: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

04 Feb, 2008

3 commits

  • A reset function solves three problems:

    1) It allows us to renegotiate features, eg. if we want to upgrade a
    guest driver without rebooting the guest.

    2) It gives us a clean way of shutting down virtqueues: after a reset,
    we know that the buffers won't be used by the host, and

    3) It helps the guest recover from messed-up drivers.

    So we remove the ->shutdown hook, and the only way we now remove
    feature bits is via reset.

    We leave it to the driver to do the reset before it deletes queues:
    the balloon driver, for example, needs to chat to the host in its
    remove function.

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • It seems that virtio_net wants to disable callbacks (interrupts) before
    calling netif_rx_schedule(), so we can't use the return value to do so.

    Rename "restart" to "cb_enable" and introduce "cb_disable" hook: callback
    now returns void, rather than a boolean.

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • Previously we used a type/len pair within the config space, but this
    seems overkill. We now simply define a structure which represents the
    layout in the config space: the config space can now only be extended
    at the end.

    The main driver-visible changes:
    1) We indicate what fields are present with an explicit feature bit.
    2) Virtqueues are explicitly numbered, and not in the config space.
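
    A sketch of what "a structure which represents the layout" can look like.
    The struct, the feature bit and demo_read_size_max() are hypothetical, and
    the sketch ignores endianness: a field is read at its fixed offset, and
    only when the feature bit that announces it is set.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical config layout; new fields may only be appended at the end. */
struct demo_blk_config {
    uint64_t capacity;
    uint32_t size_max;      /* present only if DEMO_F_SIZE_MAX is set */
};

#define DEMO_F_SIZE_MAX 1u  /* hypothetical feature bit number */

/* Read size_max from the raw config space; return -1 if it is not present. */
static int demo_read_size_max(const uint8_t *config_space, uint32_t features,
                              uint32_t *out)
{
    if (!(features & (UINT32_C(1) << DEMO_F_SIZE_MAX)))
        return -1;
    memcpy(out, config_space + offsetof(struct demo_blk_config, size_max),
           sizeof(*out));
    return 0;
}
```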

    Signed-off-by: Rusty Russell

    Rusty Russell
     

31 Jan, 2008

2 commits

  • drivers/lguest/x86/core.c: In function ‘copy_in_guest_info’:
    drivers/lguest/x86/core.c:97: error: ‘struct x86_hw_tss’ has no member named ‘esp1’

    Signed-off-by: Rusty Russell
    Signed-off-by: Linus Torvalds

    Rusty Russell
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (27 commits)
    lguest: use __PAGE_KERNEL instead of _PAGE_KERNEL
    lguest: Use explicit includes rateher than indirect
    lguest: get rid of lg variable assignments
    lguest: change gpte_addr header
    lguest: move changed bitmap to lg_cpu
    lguest: move last_pages to lg_cpu
    lguest: change last_guest to last_cpu
    lguest: change spte_addr header
    lguest: per-vcpu lguest pgdir management
    lguest: make pending notifications per-vcpu
    lguest: makes special fields be per-vcpu
    lguest: per-vcpu lguest task management
    lguest: replace lguest_arch with lg_cpu_arch.
    lguest: make registers per-vcpu
    lguest: make emulate_insn receive a vcpu struct.
    lguest: map_switcher_in_guest() per-vcpu
    lguest: per-vcpu interrupt processing.
    lguest: per-vcpu lguest timers
    lguest: make hypercalls use the vcpu struct
    lguest: make write() operation smp aware
    ...

    Manual conflict resolved (maybe even correctly, who knows) in
    drivers/lguest/x86/core.c

    Linus Torvalds
     

30 Jan, 2008

19 commits