

[Xen-devel] Dom0 hang problem

To: <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] Dom0 hang problem
From: "Subrahmanian, Raj" <raj.subrahmanian@xxxxxxxxxx>
Date: Wed, 27 Sep 2006 18:26:22 -0400
Delivery-date: Wed, 27 Sep 2006 15:27:11 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <451A9B2E.DD28.00D3.0@xxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: Acbifhee5BPl2WZzTjmuvQ4OpFVXSQABN5nw
Thread-topic: Dom0 hang problem

I have been running on an 8-way, 32 GB ES7000 system.
I am on the tip of the xen-unstable tree (changeset 11627).
To test Xen, I ran 4 DomUs: 2 of them were 8-way with 10 GB of memory,
and the other 2 were 4-way with 1 GB of memory. The DomUs come up, run
kernbench and LTP, and shut down. After about 3 hours of running, I
tried to do an xm list and the machine locked up.
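
For anyone trying to reproduce this, the load described above can be totalled with a rough sketch like the one below. It assumes that "8-way" and "4-way" mean 8 and 4 (v)CPUs, that the memory figures are per DomU, and that Dom0's own VCPUs and memory are not counted. The variable names are illustrative only and are not taken from any Xen tool or config file.

```python
# Rough tally of the test configuration described above.
# Assumptions: "N-way" means N (v)CPUs, memory figures are per DomU,
# and Dom0's own VCPUs and memory are left out of the totals.
HOST_CPUS = 8
HOST_MEM_GB = 32

# (vcpus, memory_gb) for each DomU
domus = [(8, 10), (8, 10), (4, 1), (4, 1)]

total_vcpus = sum(vcpus for vcpus, _ in domus)
total_mem_gb = sum(mem for _, mem in domus)
vcpu_ratio = total_vcpus / HOST_CPUS

print(f"DomU VCPUs: {total_vcpus} on {HOST_CPUS} CPUs ({vcpu_ratio:.1f}x)")
print(f"DomU memory: {total_mem_gb} GB of {HOST_MEM_GB} GB")
```

Under those assumptions, the DomUs together ask for more VCPUs than the host has CPUs while staying within the host's memory.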

I could not ssh into the box, but I could ping it. I could get data from
the serial machine. Leaving the machine untouched for a long time does
not alleviate the problem. 

The 8-way DomUs had completed their tests at this point, and the 4-way
DomUs had finished their kernbench tests and were running LTP.

Has anyone else seen issues like this?
How can I debug this problem?

I have attached the before and after info from the serial machine: run
queues, memory info, VM info, etc.

I discovered this problem when I was giving Ryan Harper's NUMA patches a
spin last week. 
Further investigation revealed that the issue was *not* with the NUMA
patches, but occurs in the mainline xen-unstable kernel. 

Xen Development Team
Unisys Corp.

Attachment: after.txt
Attachment: before.txt
Attachment: bootup.txt
