xen-devel
Re: [Xen-devel] Sles9.3 HVM guest block
On 20/2/07 17:00, "Woller, Thomas" <thomas.woller@xxxxxxx> wrote:
> During regression testing of 32b UP SLES9.3/SUSE10 HVM guests on a 64b hv,
> we are seeing a problem with the guest becoming permanently blocked (b
> state). Blockage occurs at fairly random times... booting, fsck,
> ltp/cerberus - on both AMD-V and VT, and takes from 5 minutes to many
> hours to fail. The last c/s tested that shows the problem was 13947.
> We've traced it back to changeset 13320. If we boot the guest with
> hpet=disabled, then the guest runs without problem (tested 48 hours w/o
> failure). Re-adding the "vcpu_kick" line removed by c/s 13320 also
> alleviates the problem (24 hours w/o failure).
> Let me know if you need any more details concerning the guest
> configuration or host machine, or if you believe alternate testing
> parameters would be useful, and we can run additional tests.
Thanks for tracking this one down to the HPET logic. However, reinstating
the vcpu_kick() call that the changeset removed is not really the correct
fix. A vcpu_kick() may rescue otherwise-lost VCPUs I suppose, but there's no
logical reason that it should be necessary. Any necessary wakeup should
occur via an interrupt delivery from hpet_route_interrupt().
After all, there's no point in waking up a VCPU unless it has work to do,
which will usually mean that you are in the process of delivering it an
interrupt (hence the vcpu_kick() invocations in vpic.c, vioapic.c and
vlapic.c). The invocation in vpt.c is actually correct because it is tied to
the pending_intr_nr logic, which is checked in the exit-to-guest path of
a woken VCPU.
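To illustrate the argument, here is a rough toy model in C. It is not Xen code: the struct and every toy_* name are made up for illustration, and the real paths are considerably more involved. It only sketches why a bare kick achieves nothing by itself, while a kick paired with recorded pending work does.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these types and names are hypothetical, not Xen's. */
struct toy_vcpu {
    bool blocked;          /* true while the VCPU sleeps waiting for work */
    int  pending_intr_nr;  /* interrupts queued but not yet injected */
    int  injected;         /* interrupts the guest has actually received */
};

/* Wake the VCPU. On its own this gives the VCPU nothing to do. */
void toy_kick(struct toy_vcpu *v)
{
    v->blocked = false;
}

/* Deliver an interrupt: record the pending work first, then wake. */
void toy_deliver_interrupt(struct toy_vcpu *v)
{
    v->pending_intr_nr++;
    toy_kick(v);
}

/* Exit-to-guest path of a woken VCPU: inject whatever is pending.
 * The toy guest then goes idle and blocks again. */
void toy_exit_to_guest(struct toy_vcpu *v)
{
    assert(!v->blocked);   /* only a woken VCPU reaches this path */
    v->injected += v->pending_intr_nr;
    v->pending_intr_nr = 0;
    v->blocked = true;
}

/* One bare kick followed by one real delivery; returns the number of
 * interrupts the guest ends up receiving. */
int toy_demo(void)
{
    struct toy_vcpu v = { .blocked = true, .pending_intr_nr = 0, .injected = 0 };

    toy_kick(&v);              /* wakes the VCPU, but nothing is pending */
    toy_exit_to_guest(&v);     /* injects nothing, VCPU blocks again */

    toy_deliver_interrupt(&v); /* pending work recorded, then kick */
    toy_exit_to_guest(&v);     /* this time an interrupt is injected */

    return v.injected;
}
```

In this model only the second wakeup does anything useful. If an extra kick makes a real hang go away, that suggests some delivery path recorded an interrupt without the matching wakeup, and that path is the thing to find.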
It's worth trying to grab some more info about a guest when it hangs: How
are the HPET timers configured? In particular, how should interrupts be
delivered? Does it look like an interrupt has been delivered but not
notified? Etc.
-- Keir