I don't think this is the right fix, but it does highlight the issue.
While killing a domain, the vcpus are descheduled, but every now and
then one of the cpus is still running one of the vcpus, which means
d->cpumask is not empty. This triggers the BUG_ON() in
xen/arch/x86/domain.c:domain_relinquish_resources(). The patch adds
some printks and a cpu_relax() loop that spins until the cpumask is
empty before calling domain_relinquish_resources(). With this patch,
I've gone through several thousand iterations of create/destroy
without crashing.
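
To illustrate the wait described above outside the hypervisor, here is a
minimal single-threaded sketch. It is not Xen code: sim_domain,
sim_cpus_empty(), sim_cpu_relax() and wait_for_cpumask_clear() are
hypothetical stand-ins, and the "relax" step simulates one cpu finishing
its deschedule instead of pausing a real processor. Unlike the patch, the
sketch also bounds the spin (an assumption on my part) so that a mask
which never clears is reported rather than hanging forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct domain: bit n of cpumask set means
 * simulated cpu n is still running one of the domain's vcpus. */
typedef struct {
    unsigned long cpumask;
} sim_domain;

/* Stand-in for cpus_empty(d->cpumask). */
static bool sim_cpus_empty(const sim_domain *d)
{
    return d->cpumask == 0;
}

/* Stand-in for cpu_relax(). In this simulation each call lets the
 * lowest-numbered busy cpu finish descheduling and clear its bit.
 * Only called while the mask is non-zero. */
static void sim_cpu_relax(sim_domain *d)
{
    d->cpumask &= d->cpumask - 1; /* clear the lowest set bit */
}

/* Spin until the mask is empty, giving up after max_spins iterations.
 * Returns the number of spins performed, or -1 if the mask was still
 * non-empty when the limit was reached. */
static long wait_for_cpumask_clear(sim_domain *d, long max_spins)
{
    long spins = 0;

    assert(d != NULL);
    while (!sim_cpus_empty(d)) {
        if (spins >= max_spins)
            return -1;
        sim_cpu_relax(d);
        spins++;
    }
    return spins;
}
```

In the real hypervisor the mask would be cleared by the other cpus as they
switch away from the dying domain's vcpus, so the loop body would only
pause the spinning cpu; the bounded variant is just one way to make a
stuck mask visible.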
--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253 T/L: 678-9253
ryanh@xxxxxxxxxx
diffstat output:
domain.c | 8 ++++++++
1 files changed, 8 insertions(+)
Signed-off-by: Ryan Harper <ryanh@xxxxxxxxxx>
---
diff -r 413c911e5780 xen/common/domain.c
--- a/xen/common/domain.c Mon Sep 12 12:48:33 2005
+++ b/xen/common/domain.c Mon Sep 12 13:25:07 2005
@@ -112,6 +112,14 @@
     {
         for_each_vcpu(d, v)
             sched_rem_domain(v);
+        if(!cpus_empty(d->cpumask)) {
+            printk("ACK! DOM%d still running, waiting before dying\n",
+                   d->domain_id);
+            while(!cpus_empty(d->cpumask))
+                cpu_relax();
+            printk("DOM%d cpumask clear, relinquishing resources\n",
+                   d->domain_id);
+        }
         domain_relinquish_resources(d);
         put_domain(d);