WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-devel

Re: [Xen-devel] Domain 0 stop response on frequently reboot VMS

To: MaoXiaoyun <tinnycloud@xxxxxxxxxxx>, xen devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: Re: [Xen-devel] Domain 0 stop response on frequently reboot VMS
From: Keir Fraser <keir@xxxxxxx>
Date: Sat, 16 Oct 2010 08:16:51 +0100
Cc:
Delivery-date: Sat, 16 Oct 2010 00:17:38 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <BLU157-w8114EC5EB660DA26E51B9DA580@xxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: ActtAhpfxa7BDfC+3UC4aGDkQs35qw==
Thread-topic: [Xen-devel] Domain 0 stop response on frequently reboot VMS
User-agent: Microsoft-Entourage/12.26.0.100708
Send a patch to the list, Cc Jeremy Fitzhardinge and also a blktap
maintainer, which you should be able to derive from changeset histories and
signed-off-by lines. Flag it clearly in the subject line as a proposed
bugfix for pv_ops.

 -- Keir

On 16/10/2010 06:39, "MaoXiaoyun" <tinnycloud@xxxxxxxxxxx> wrote:

> Well, thanks Keir.
> Fortunately we caught the bug; it turned out to be a tapdisk problem.
> Here is a brief explanation for others who might run into this issue.
>  
> Clearing BLKTAP_DEFERRED on line 19 allows concurrent access to
> tap->deferred_queue between lines 24 and 37, which eventually corrupts the
> tap->deferred_queue pointers and causes an infinite loop in the while clause
> on line 22. Taking the lock around line 24 would be a simple fix.
>  
> /linux-2.6-pvops.git/drivers/xen/blktap/wait_queue.c
>   9 void
>  10 blktap_run_deferred(void)
>  11 {
>  12     LIST_HEAD(queue);
>  13     struct blktap *tap;
>  14     unsigned long flags;
>  15     
>  16     spin_lock_irqsave(&deferred_work_lock, flags);
>  17     list_splice_init(&deferred_work_queue, &queue);
>  18     list_for_each_entry(tap, &queue, deferred_queue)
>  19         clear_bit(BLKTAP_DEFERRED, &tap->dev_inuse);
>  20     spin_unlock_irqrestore(&deferred_work_lock, flags);
>  21     
>  22     while (!list_empty(&queue)) {
>  23         tap = list_entry(queue.next, struct blktap, deferred_queue);
>  24         list_del_init(&tap->deferred_queue);
>  25         blktap_device_restart(tap);
>  26     }   
>  27 }   
>  28 
>  29 void
>  30 blktap_defer(struct blktap *tap)
>  31 {
>  32     unsigned long flags;
>  33     
>  34     spin_lock_irqsave(&deferred_work_lock, flags);
>  35     if (!test_bit(BLKTAP_DEFERRED, &tap->dev_inuse)) {
>  36         set_bit(BLKTAP_DEFERRED, &tap->dev_inuse);
>  37         list_add_tail(&tap->deferred_queue, &deferred_work_queue);
>  38     }   
>  39     spin_unlock_irqrestore(&deferred_work_lock, flags);
>  40 } 
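
For reference, the locking idea described above can be sketched in userspace C roughly as follows. This is only an illustration under stated assumptions, not the patch for pv_ops: a pthread mutex stands in for deferred_work_lock, a plain bool stands in for the BLKTAP_DEFERRED bit, and struct tap, tap_defer(), run_deferred() and tap_restart() are hypothetical stand-ins for the kernel names. The sketch also goes one step beyond locking line 24: it clears the flag in the same critical section as the dequeue. With the lock alone, an entry whose flag was already cleared on line 19 could apparently still be re-added by blktap_defer() while it is linked on the local queue. That variation is an assumption of this sketch, not something stated in the thread.

```c
/* Userspace sketch only; names and the mutex are stand-ins (assumptions). */
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's <linux/list.h>. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }
static bool list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

static void list_del_init(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init(n);
}

/* Simplified splice: moves all of 'from' onto 'to', which must be empty. */
static void list_splice_init(struct list_head *from, struct list_head *to)
{
    if (list_empty(from))
        return;
    to->next = from->next;
    to->prev = from->prev;
    to->next->prev = to;
    to->prev->next = to;
    list_init(from);
}

struct tap {
    bool deferred;                   /* stands in for BLKTAP_DEFERRED */
    int restarts;                    /* how often the restart stub ran */
    struct list_head deferred_queue;
};

#define tap_of(ptr) \
    ((struct tap *)((char *)(ptr) - offsetof(struct tap, deferred_queue)))

static pthread_mutex_t deferred_work_lock = PTHREAD_MUTEX_INITIALIZER;
static struct list_head deferred_work_queue =
    { &deferred_work_queue, &deferred_work_queue };

/* Stub standing in for blktap_device_restart(). */
static void tap_restart(struct tap *t) { t->restarts++; }

void tap_defer(struct tap *t)
{
    pthread_mutex_lock(&deferred_work_lock);
    if (!t->deferred) {
        t->deferred = true;
        list_add_tail(&t->deferred_queue, &deferred_work_queue);
    }
    pthread_mutex_unlock(&deferred_work_lock);
}

void run_deferred(void)
{
    struct list_head queue;

    list_init(&queue);

    pthread_mutex_lock(&deferred_work_lock);
    list_splice_init(&deferred_work_queue, &queue);
    pthread_mutex_unlock(&deferred_work_lock);

    /* Only this thread touches the local queue: entries on it still have
     * their flag set, so tap_defer() will not re-link them. */
    while (!list_empty(&queue)) {
        struct tap *t = tap_of(queue.next);

        /* Unlink and clear the flag atomically with respect to tap_defer(). */
        pthread_mutex_lock(&deferred_work_lock);
        list_del_init(&t->deferred_queue);
        t->deferred = false;
        pthread_mutex_unlock(&deferred_work_lock);

        tap_restart(t);
    }
}
```

In the kernel code the mutex calls would correspond to the existing spin_lock_irqsave() / spin_unlock_irqrestore() pair on deferred_work_lock. Each struct tap needs its deferred_queue initialised with list_init() before the first tap_defer() call.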
> 
>  
>> Date: Fri, 15 Oct 2010 13:57:09 +0100
>> Subject: Re: [Xen-devel] Domain 0 stop response on frequently reboot VMS
>> From: keir@xxxxxxx
>> To: tinnycloud@xxxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx
>> 
>> You'll probably want to see if you can get SysRq output from dom0 via serial
>> line. It's likely you can if it is alive enough to respond to ping. This
>> might tell you things like what all processes are getting blocked on, and
>> thus indicate what is stopping dom0 from making progress.
>> 
>> -- Keir
>> 
>> On 15/10/2010 13:43, "MaoXiaoyun" <tinnycloud@xxxxxxxxxxx> wrote:
>> 
>>> 
>>> Hi Keir:
>>> 
>>> First, I'd like to express my appreciation for the help you offered
>>> before.
>>> Recently we ran into a rather nasty problem where domain 0 stops
>>> responding.
>>> 
>>> We are still running a test in which 12 HVMs reboot almost continuously
>>> and concurrently on one physical server.
>>> After a few hours the server looks dead. We can only ping
>>> the server and get a correct response.
>>> Xen itself works fine, since we can get debug info from the serial port.
>>> Attached is the full debug output.
>>> After decoding the domain 0 CPU stack, I find the CPU is still working for
>>> domain 0, since the stack info changed every time I dumped it.
>>> 
>>> Could you help take a look at the attachment to see whether there are
>>> any hints for debugging this
>>> problem? Thanks in advance.
>>> 
>>> 
>>> _______________________________________________
>>> Xen-devel mailing list
>>> Xen-devel@xxxxxxxxxxxxxxxxxxx
>>> http://lists.xensource.com/xen-devel
>> 
>> 



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel