xen-devel

Re: [Xen-devel] Hypervisor crash(!) on xl cpupool-numa-split

To: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
Subject: Re: [Xen-devel] Hypervisor crash(!) on xl cpupool-numa-split
From: Juergen Gross <juergen.gross@xxxxxxxxxxxxxx>
Date: Tue, 08 Feb 2011 13:23:08 +0100
Cc: Andre Przywara <andre.przywara@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, "Diestelhorst, Stephan" <Stephan.Diestelhorst@xxxxxxx>
Organization: Fujitsu Technology Solutions

On 02/08/11 13:08, George Dunlap wrote:
> On Tue, Feb 8, 2011 at 5:43 AM, Juergen Gross
> <juergen.gross@xxxxxxxxxxxxxx> wrote:
>> On 02/07/11 16:55, George Dunlap wrote:
>>>
>>> Juergen,
>>>
>>> What is supposed to happen if a domain is in cpupool0, and then all of
>>> the cpus are taken out of cpupool0?  Is that possible?
>>
>> No. Cpupool0 can't be without any cpu, as Dom0 is always member of cpupool0.
>
> If that's the case, then since Andre is running this immediately after
> boot, he shouldn't be seeing any vcpus in the new pools; and all of
> the dom0 vcpus should be migrated to cpupool0, right?  Is it possible
> that migration process isn't happening properly?

Again: it is not the vcpus that are migrated to cpupool0. The physical cpus
are taken away from it, so the vcpus active on the cpu to be moved MUST
be migrated to other cpus of cpupool0.
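
To picture the distinction, here is a minimal toy model, not the Xen code. All names (toy_pool, toy_vcpu, toy_remove_cpu) are hypothetical, and the rule that the last cpu cannot be removed is a simplification of the cpupool0 constraint mentioned above. The vcpus never change pool; only their current physical cpu changes when that cpu leaves the pool.

```c
#include <stdint.h>

#define TOY_NR_VCPUS 4

/* Hypothetical toy types; these are NOT Xen's struct vcpu or struct cpupool. */
struct toy_vcpu {
    int processor;                 /* physical cpu the vcpu currently sits on */
};

struct toy_pool {
    uint32_t cpu_mask;             /* bit n set => physical cpu n is in the pool */
    struct toy_vcpu vcpus[TOY_NR_VCPUS];
};

/* Return the lowest-numbered cpu in mask, or -1 if the mask is empty. */
static int toy_first_cpu(uint32_t mask)
{
    for (int cpu = 0; cpu < 32; cpu++)
        if (mask & (UINT32_C(1) << cpu))
            return cpu;
    return -1;
}

/*
 * Take physical cpu `cpu` out of the pool. The vcpus stay in the pool;
 * any vcpu sitting on the removed cpu is moved to a remaining cpu.
 * Returns 0 on success, -1 if the cpu is not in the pool or is the
 * last cpu left (assumption: the toy pool may never become empty).
 */
static int toy_remove_cpu(struct toy_pool *pool, int cpu)
{
    if (cpu < 0 || cpu >= 32)
        return -1;

    uint32_t bit = UINT32_C(1) << cpu;
    if (!(pool->cpu_mask & bit))
        return -1;

    uint32_t remaining = pool->cpu_mask & ~bit;
    if (remaining == 0)
        return -1;

    int target = toy_first_cpu(remaining);
    for (int i = 0; i < TOY_NR_VCPUS; i++)
        if (pool->vcpus[i].processor == cpu)
            pool->vcpus[i].processor = target;

    pool->cpu_mask = remaining;
    return 0;
}
```

In this sketch, removing cpu 1 from a pool holding cpus 0 and 1 leaves every vcpu on cpu 0, and a further attempt to remove cpu 0 is refused.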


> It looks like schedule.c:cpu_disable_scheduler() will try to migrate
> all vcpus, and if it fails to migrate, it returns -EAGAIN so that the
> tools will try again.  It's probably worth instrumenting that whole
> code-path to make sure it actually happens as we expect.  Are we
> certain, for example, that if a hypercall continued on another cpu
> will actually return the new error value properly?

I have checked that and never saw any problem. And yes, I did see
the EAGAIN case happen.
With my test patch, which always executes cpu_disable_scheduler() on the
cpu to be moved, this should not be a problem at all, since the tasklet
is always running in the idle vcpu.
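
For readers unfamiliar with the retry convention, the -EAGAIN handling on the caller's side can be sketched roughly as below. This is an illustration under assumptions, not the xl/libxl or hypervisor code: toy_retry_on_eagain, toy_op_fn and toy_busy_op are hypothetical names, and the bounded retry count is my own choice for the example.

```c
#include <errno.h>

/* Hypothetical operation type: returns 0 on success or a negative errno. */
typedef int (*toy_op_fn)(void *ctx);

/*
 * Call op(ctx) until it returns something other than -EAGAIN, or until
 * max_tries attempts have been made. Returns the last result, so the
 * caller still sees -EAGAIN if the operation never got through.
 */
static int toy_retry_on_eagain(toy_op_fn op, void *ctx, int max_tries)
{
    int rc = -EAGAIN;

    for (int i = 0; i < max_tries && rc == -EAGAIN; i++)
        rc = op(ctx);

    return rc;
}

/*
 * Example operation standing in for "remove the cpu": it reports -EAGAIN
 * while *busy_rounds is positive (as if a vcpu could not be migrated yet)
 * and succeeds afterwards.
 */
static int toy_busy_op(void *ctx)
{
    int *busy_rounds = ctx;

    if (*busy_rounds > 0) {
        (*busy_rounds)--;
        return -EAGAIN;
    }
    return 0;
}
```

The point of returning the last result unchanged is that a permanent failure remains visible to the caller as -EAGAIN instead of being silently swallowed.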


> Another minor thing: In cpupool.c:cpupool_unassign_cpu_helper(), why
> is the cpu's bit set in cpupool_free_cpus without checking to see if
> the cpu_disable_scheduler() call actually worked?  Shouldn't that also
> be inside the if() statement?

No, I don't think so. If removing a cpu fails permanently after having
returned -EAGAIN before, it should be easy to add it back to the original
cpupool. This can only be done if it is flagged as free. Adding it to
another cpupool will be denied, as cpupool_cpu_moving is still set.
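
The reasoning can be illustrated with a small state model. This is a sketch only and does not reproduce cpupool.c: toy_state, toy_unassign and toy_assign are hypothetical, the -EBUSY return values are an assumption, and the result of the scheduler-disable step is simply passed in as a parameter.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical toy state; cpu numbers are assumed to be in 0..31. */
struct toy_state {
    uint32_t free_mask;   /* bit n set => cpu n is flagged as free */
    int moving_cpu;       /* cpu whose removal is still in flight, or -1 */
    int moving_pool;      /* pool that cpu is being removed from */
};

/*
 * Start removing `cpu` from `pool`. The free bit is set unconditionally,
 * whatever disable_rc says. If the disable step did not succeed, the cpu
 * stays marked as "moving" out of its original pool.
 */
static int toy_unassign(struct toy_state *s, int cpu, int pool, int disable_rc)
{
    s->free_mask |= UINT32_C(1) << cpu;

    if (disable_rc == 0) {
        s->moving_cpu = -1;
    } else {
        s->moving_cpu = cpu;
        s->moving_pool = pool;
    }
    return disable_rc;
}

/*
 * Add `cpu` to `pool`. Only a free cpu can be added. A cpu that is still
 * moving may only go back to the pool it was being removed from.
 */
static int toy_assign(struct toy_state *s, int cpu, int pool)
{
    uint32_t bit = UINT32_C(1) << cpu;

    if (!(s->free_mask & bit))
        return -EBUSY;                 /* not flagged as free */

    if (s->moving_cpu == cpu) {
        if (pool != s->moving_pool)
            return -EBUSY;             /* still moving: other pools denied */
        s->moving_cpu = -1;            /* re-added to its original pool */
    }

    s->free_mask &= ~bit;
    return 0;
}
```

In this model, if the free bit were only set after a successful disable, a cpu stuck in the -EAGAIN state could not be handed back to its original pool at all.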


Juergen

--
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@xxxxxxxxxxxxxx
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html
