WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-devel

Re: [Xen-devel] Sles9.3 HVM guest block

To: "Woller, Thomas" <thomas.woller@xxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: Re: [Xen-devel] Sles9.3 HVM guest block
From: Keir Fraser <keir@xxxxxxxxxxxxx>
Date: Tue, 20 Feb 2007 17:35:22 +0000
Cc: "Wilson, Stephen" <Stephen.Wilson@xxxxxxx>
On 20/2/07 17:00, "Woller, Thomas" <thomas.woller@xxxxxxx> wrote:

> During regression testing of 32b UP SLES9.3/SUSE10 HVM guests on a 64b hv,
> we are seeing a problem with the guest becoming permanently blocked (b
> state).  Blockage occurs at fairly random times (booting, fsck,
> ltp/cerberos) on both AMD-V and VT, and takes from 5 minutes to many
> hours to occur.  The last c/s tested in which we see the problem was 13947.
> We've traced it back to changeset 13320.  If we boot the guest with
> hpet=disabled, then the guest runs without problem (tested 48 hours w/o
> failure).  Re-adding the "vcpu_kick" line removed by c/s 13320 also
> alleviates the problem (24 hours w/o failure).
> Let me know if you need any more details concerning the guest
> configuration or host machine, or if you believe alternate testing
> parms would be useful, and we can run additional tests.

Thanks for tracking this one down to the HPET logic. However, reinstating
the vcpu_kick() that this changeset removed is not really the correct fix.
It may rescue otherwise-lost VCPUs, I suppose, but there's no logical
reason that it should be necessary. Any necessary wakeup should occur via
an interrupt delivery from hpet_route_interrupt().

After all, there's no point in waking up a VCPU unless it has work to do,
which will usually mean that you are in the process of delivering an
interrupt to it (hence the vcpu_kick() invocations in vpic.c, vioapic.c and
vlapic.c). The invocation in vpt.c is actually correct because it is tied up
in the pending_intr_nr logic, which gets checked in the exit-to-guest path
of a woken VCPU.
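
To illustrate the point about only waking a VCPU that has work to do, here is a minimal toy model in C. It is not Xen code: the struct and the toy_* functions are hypothetical names made up for this sketch, and only the general shape (mark the interrupt pending, then kick; check for pending work on the exit-to-guest path) follows the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: these types and functions are hypothetical, not Xen's. */
struct toy_vcpu {
    bool     blocked;       /* true while the VCPU sleeps waiting for work */
    uint32_t pending_intr;  /* bitmask of interrupt lines awaiting injection */
};

/* Wake the VCPU. This does not by itself give it anything to do. */
static void toy_vcpu_kick(struct toy_vcpu *v)
{
    v->blocked = false;
}

/* Deliver an interrupt: record it as pending first, then wake the target. */
static void toy_deliver_interrupt(struct toy_vcpu *v, unsigned int line)
{
    v->pending_intr |= (uint32_t)1 << (line & 31);
    toy_vcpu_kick(v);
}

/*
 * Exit-to-guest path of a woken VCPU: inject the lowest pending line if
 * there is one, otherwise block again. Returns the line, or -1 if none.
 */
static int toy_exit_to_guest(struct toy_vcpu *v)
{
    if (v->pending_intr == 0) {
        v->blocked = true;  /* woken with no work: go back to sleep */
        return -1;
    }
    for (unsigned int line = 0; line < 32; line++) {
        if (v->pending_intr & ((uint32_t)1 << line)) {
            v->pending_intr &= ~((uint32_t)1 << line);
            return (int)line;
        }
    }
    return -1;  /* not reached: pending_intr was non-zero */
}
```

In this model a bare toy_vcpu_kick() with nothing pending just makes the VCPU block again, whereas toy_deliver_interrupt() both records the work and performs the wakeup. If a bare kick appears to fix a hang, that suggests an interrupt was recorded somewhere without the matching wakeup.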

It's worth trying to grab some more info about a guest when it hangs: How
are the HPET timers configured? In particular, how should interrupts be
delivered? Does it look like an interrupt has been delivered but not
notified? Etc.
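
As a starting point for answering the configuration questions, here is a small C sketch that decodes a raw HPET per-timer configuration value. The bit positions are those of the Timer N Configuration and Capability register in the IA-PC HPET specification; the struct and function names are made up for illustration, and how the raw values are obtained from a hung guest (for example, via extra debug output) is left open here.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Decoded view of one "Timer N Configuration and Capability" register. */
struct hpet_timer_cfg {
    bool         level_triggered; /* bit 1: Tn_INT_TYPE_CNF (0 = edge) */
    bool         int_enabled;     /* bit 2: Tn_INT_ENB_CNF */
    bool         periodic;        /* bit 3: Tn_TYPE_CNF (0 = one-shot) */
    bool         mode32;          /* bit 8: Tn_32MODE_CNF */
    unsigned int ioapic_route;    /* bits 13:9: Tn_INT_ROUTE_CNF */
    bool         fsb_enabled;     /* bit 14: Tn_FSB_EN_CNF */
};

/* Hypothetical helper: split a raw register value into its fields. */
static struct hpet_timer_cfg hpet_decode_timer_cfg(uint64_t raw)
{
    struct hpet_timer_cfg c;

    c.level_triggered = (raw >> 1) & 1;
    c.int_enabled     = (raw >> 2) & 1;
    c.periodic        = (raw >> 3) & 1;
    c.mode32          = (raw >> 8) & 1;
    c.ioapic_route    = (unsigned int)((raw >> 9) & 0x1f);
    c.fsb_enabled     = (raw >> 14) & 1;
    return c;
}

/*
 * Print one summary line per timer. legacy_route is the LEG_RT_CNF bit
 * (bit 1) of the HPET General Configuration register; when it is set,
 * timers 0 and 1 use the legacy routing instead of Tn_INT_ROUTE_CNF.
 */
static void hpet_print_timer_cfg(unsigned int n, uint64_t raw,
                                 bool legacy_route)
{
    struct hpet_timer_cfg c = hpet_decode_timer_cfg(raw);

    printf("timer %u: irq %s, %s, %s-triggered, ",
           n,
           c.int_enabled ? "enabled" : "disabled",
           c.periodic ? "periodic" : "one-shot",
           c.level_triggered ? "level" : "edge");
    if (legacy_route && n < 2)
        printf("legacy replacement route\n");
    else if (c.fsb_enabled)
        printf("FSB delivery\n");
    else
        printf("IOAPIC input %u\n", c.ioapic_route);
}
```

Comparing this output for each timer against the interrupt that the guest is actually waiting on should show whether the expected delivery path is one that ends in a wakeup.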

 -- Keir

