Re: [Xen-devel] Race in vlapic init_sipi tasklet

To:	George Dunlap <dunlapg@xxxxxxxxx>
Subject:	Re: [Xen-devel] Race in vlapic init_sipi tasklet
From:	Keir Fraser <keir@xxxxxxx>
Date:	Wed, 20 Oct 2010 10:00:06 +0100
Cc:	xen-devel@xxxxxxxxxxxxxxxxxxx, Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Delivery-date:	Wed, 20 Oct 2010 02:01:56 -0700

It turned out there is a really simple hypervisor fix for this which I
checked in as c/s 22265. By the way I tested this with "sched=credit2
maxcpus=1" and in that configuration the next thing of note is that the
xenstore interactions in hvmloader run like a dog. Probably something to do
with hvmloader polling via yield, and some bad interaction with credit2's
handling of yield. I seemed to get stuck with no VCPUs running at all, and
dom0 unresponsive, which was weird. Probably hvmloader should really be
using SCHEDOP_poll and properly waiting on its end of the event channel. But
still there is obviously some fishy scheduling issue here regardless of
hvmloader's current naivety.

 -- Keir

On 18/10/2010 18:16, "George Dunlap" <dunlapg@xxxxxxxxx> wrote:

> I've been tracking down a bug where a multi-vcpu VM hangs in the
> hvmloader on credit2, but not on credit1.  It hangs while trying to
> bring up extra cpus.
> 
> It turns out that an unintended quirk in credit2 (some might call it a
> bug) causes a scheduling order which exposes a race in the vlapic
> init_sipi tasklet handling code.
> 
> The code, as it stands right now, is meant to do this:
> * v0 does an APIC ICR write with APIC_DM_STARTUP, trapping to Xen.
> * vlapic code checks to see that v1 is down (vlapic.c:318); finds that
> it is down, and schedules the tasklet, returning X86_EMUL_RETRY
> (vlapic.c:270)
> * Tasklet runs, brings up v1.
> * v1 starts running.
> * v0 re-executes the instruction, finds that v1 is up, and returns
> X86_EMUL_OK, allowing the instruction to move forward.
> * v1 does some diagnostics, and takes itself offline.
> 
> Unfortunately, the credit2 scheduler almost always preempts v0
> immediately, allowing v1 to run to completion and bring itself back
> offline again, before v0 can re-try the MMIO.  It looks like this:
> * v0 does APIC ICR APIC_DM_STARTUP write, trapping to Xen.
> * vlapic code checks to see that v1 is down; finds that it is down,
> schedules the tasklet, returns X86_EMUL_RETRY
> * Tasklet runs, brings up v1
> * credit2 preempts v0, allowing v1 to run
> * v1 starts running
> * v1 does some diagnostics, and takes itself offline.
> * v0 re-executes the instruction, finds that v1 is down, and again
> schedules the tasklet and returns X86_EMUL_RETRY.
> * For some reason the tasklet doesn't actually bring up v1 again
> (presumably because it hasn't had an APIC_DM_INIT again); so v0 is
> stuck doing X86_EMUL_RETRY forever.
> 
> The problem is that VPF_down is used as the test to see if the tasklet
> has finished its work; but there's no guarantee that the scheduler
> will run v0 before v1 has come up and gone back down again.
> 
> I discussed this with Tim, and we agreed that we should ask you.
> 
> One option would be to simply make vlapic_schedule_sipi_init_ipi()
> always return X86_EMUL_OK, but we weren't sure if that might cause
> some other problems.
> 
> The "right" solution, if synchronization is needed, is to have an
> explicit signal sent back that the instruction can be allowed to
> complete, rather than relying on reading VPF_down, which may cause
> races.
> 
> Thoughts?
> 
>  -George
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
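
The two schedules George describes, and the explicit completion signal he proposes, can be sketched as a toy model. This is not Xen code and not the change checked in as c/s 22265: the struct, the function names, and the `sipi_done` flag are all hypothetical, and `v1_down` merely stands in for VPF_down on v1. It assumes, as the quoted text presumes, that the tasklet only brings v1 up once per INIT.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every name here is hypothetical, none is a Xen identifier. */
enum emul_result { EMUL_OK, EMUL_RETRY };

struct model {
    bool v1_down;          /* stands in for VPF_down on v1 */
    bool v1_init_pending;  /* set by INIT, consumed by the first SIPI */
    bool tasklet_queued;   /* init_sipi tasklet has been scheduled */
    bool sipi_done;        /* hypothetical explicit "tasklet finished" signal */
};

/* v0's trapped ICR write: decide whether the instruction may complete. */
static enum emul_result icr_write(struct model *m, bool use_signal)
{
    /* Either test the explicit signal, or infer completion from v1 being up. */
    bool done = use_signal ? m->sipi_done : !m->v1_down;

    if (done) {
        m->sipi_done = false;
        return EMUL_OK;
    }
    m->tasklet_queued = true;
    return EMUL_RETRY;
}

/* The tasklet brings v1 up only if an INIT is still pending. */
static void run_tasklet(struct model *m)
{
    if (!m->tasklet_queued)
        return;
    m->tasklet_queued = false;
    if (m->v1_init_pending) {
        m->v1_init_pending = false;
        m->v1_down = false;
    }
    m->sipi_done = true;   /* signal completion regardless of v1's later state */
}

/* v1 runs its diagnostics and takes itself offline again. */
static void run_v1(struct model *m)
{
    if (!m->v1_down)
        m->v1_down = true;
}

/*
 * Returns the number of ICR write attempts v0 needed, or -1 if v0 was
 * still retrying after max_attempts.  v1_runs_first selects the ordering
 * in which v1 runs to completion before v0 retries.
 */
int simulate(bool v1_runs_first, bool use_signal, int max_attempts)
{
    struct model m = { .v1_down = true, .v1_init_pending = true };

    assert(max_attempts > 0);
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (icr_write(&m, use_signal) == EMUL_OK)
            return attempt;
        run_tasklet(&m);
        if (v1_runs_first)
            run_v1(&m);
    }
    return -1;
}
```

In this model, simulate(false, false, 10) returns 2 (v0 retries once and sees v1 up), while simulate(true, false, 10) returns -1 because v1 is already back down at every retry. With the explicit flag, simulate(true, true, 10) returns 2 under the same ordering, since completion no longer depends on v1's state at the moment v0 retries.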


