[Xen-devel] Re: Interrupt to CPU routing in HVM domains - again

To:	James Harper <james.harper@xxxxxxxxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject:	[Xen-devel] Re: Interrupt to CPU routing in HVM domains - again
From:	Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Date:	Fri, 05 Sep 2008 08:43:49 +0100
Cc:	bart brooks <bart_brooks@xxxxxxxxxxx>
List-id:	Xen developer discussion <xen-devel.lists.xensource.com>

I can absolutely assure you that only vcpu0's evtchn_pending flag influences
delivery of the xen_platform pci interrupt. In xen-unstable/xen-3.3, the
relevant function to look at is
xen/arch/x86/hvm/irq.c:hvm_assert_evtchn_irq(). It has always been this way.

In any case, there is only one interrupt line (currently), so there would
hardly be a scalability benefit to spreading out event channel bindings to
different vcpu selector words and evtchn_pending flags. :-)
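
For readers following along, the guest-side consequence can be sketched as below: the interrupt handler always reads vcpu_info[0], whichever CPU happens to run it. This is a minimal sketch under stated assumptions, not the GPLPV or Xen code. The struct layouts are trimmed stand-ins for the shared-info page, and the names shared_info_sketch, mark_pending_vcpu0 and scan_events_vcpu0 are hypothetical. A real driver would use atomic exchange and bit operations where this sketch uses plain loads and stores.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_BITS     64u                      /* assumed: 64-bit guest words */
#define NR_PORTS      (WORD_BITS * WORD_BITS)  /* two-level bitmap capacity */
#define MAX_VIRT_CPUS 32u                      /* assumed array size */

/* Trimmed stand-ins for the shared-info layout; not the real Xen headers. */
struct vcpu_info_sketch {
    uint8_t  evtchn_upcall_pending;  /* "this vcpu has work" flag */
    uint64_t evtchn_pending_sel;     /* one bit per word of evtchn_pending[] */
};

struct shared_info_sketch {
    struct vcpu_info_sketch vcpu_info[MAX_VIRT_CPUS];
    uint64_t evtchn_pending[WORD_BITS];  /* one bit per event channel port */
    uint64_t evtchn_mask[WORD_BITS];     /* set bit = port is masked */
};

/* Hypothetical helper: mark a port pending as a vcpu0-bound port would be. */
static void mark_pending_vcpu0(struct shared_info_sketch *s, unsigned port)
{
    assert(port < NR_PORTS);
    s->evtchn_pending[port / WORD_BITS] |= UINT64_C(1) << (port % WORD_BITS);
    s->vcpu_info[0].evtchn_pending_sel  |= UINT64_C(1) << (port / WORD_BITS);
    s->vcpu_info[0].evtchn_upcall_pending = 1;
}

/*
 * Interrupt-handler body: always consult vcpu_info[0], not the current CPU.
 * Stores up to max handled port numbers in out[] and returns the count.
 * Ports beyond max are left pending in this simplified version.
 */
static unsigned scan_events_vcpu0(struct shared_info_sketch *s,
                                  unsigned *out, unsigned max)
{
    struct vcpu_info_sketch *v = &s->vcpu_info[0];
    unsigned count = 0;

    v->evtchn_upcall_pending = 0;
    uint64_t sel = v->evtchn_pending_sel;
    v->evtchn_pending_sel = 0;

    for (unsigned w = 0; w < WORD_BITS; w++) {
        if (!(sel & (UINT64_C(1) << w)))
            continue;                           /* no pending ports in word w */
        uint64_t ready = s->evtchn_pending[w] & ~s->evtchn_mask[w];
        for (unsigned b = 0; b < WORD_BITS; b++) {
            uint64_t bit = UINT64_C(1) << b;
            if ((ready & bit) && count < max) {
                s->evtchn_pending[w] &= ~bit;   /* acknowledge the port */
                out[count++] = w * WORD_BITS + b;
            }
        }
    }
    return count;
}
```

Because there is a single interrupt line, one selector word and one upcall flag are enough to find every pending port; spreading the bindings over several vcpu_info entries would only add places to look.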

The only thing I can think of (and should certainly be checked!) is that
some of your event channels are erroneously getting bound to vcpu != 0. Are
you running an irq load balancer or somesuch? Obviously event channels bound
to vcpu != 0 will now never be serviced, whereas before your changes you
would probabilistically 'get lucky'.
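
The failure mode described above can be illustrated with a toy model. It is a sketch of the behaviour described in this message, not the Xen source: toy_domain, toy_bind, toy_send, toy_irq_asserted and toy_find_stranded_port are all hypothetical names, and the only rule modelled is that the interrupt line follows vcpu0's pending flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WORD_BITS     64u
#define NR_PORTS      (WORD_BITS * WORD_BITS)
#define MAX_VIRT_CPUS 32u                /* assumed array size */

struct toy_vcpu {
    bool     upcall_pending;             /* per-vcpu "events pending" flag */
    uint64_t pending_sel;                /* per-vcpu selector word */
};

struct toy_domain {
    struct toy_vcpu vcpu[MAX_VIRT_CPUS];
    uint64_t pending[WORD_BITS];         /* domain-wide pending bitmap */
    uint8_t  bound_vcpu[NR_PORTS];       /* vcpu each port notifies */
};

/* Bind an event channel port to a vcpu (the default binding is vcpu 0). */
static void toy_bind(struct toy_domain *d, unsigned port, unsigned vcpu)
{
    assert(port < NR_PORTS && vcpu < MAX_VIRT_CPUS);
    d->bound_vcpu[port] = (uint8_t)vcpu;
}

/* Raise an event: only the bound vcpu's selector and flag are touched. */
static void toy_send(struct toy_domain *d, unsigned port)
{
    assert(port < NR_PORTS);
    struct toy_vcpu *v = &d->vcpu[d->bound_vcpu[port]];
    d->pending[port / WORD_BITS] |= UINT64_C(1) << (port % WORD_BITS);
    v->pending_sel |= UINT64_C(1) << (port / WORD_BITS);
    v->upcall_pending = true;
}

/* Modelled rule: the single interrupt line follows vcpu0's flag only. */
static bool toy_irq_asserted(const struct toy_domain *d)
{
    return d->vcpu[0].upcall_pending;
}

/* Diagnostic: first pending port bound to a vcpu other than 0, or -1. */
static int toy_find_stranded_port(const struct toy_domain *d)
{
    for (unsigned port = 0; port < NR_PORTS; port++) {
        bool is_pending = (d->pending[port / WORD_BITS] >> (port % WORD_BITS)) & 1u;
        if (is_pending && d->bound_vcpu[port] != 0)
            return (int)port;
    }
    return -1;
}
```

In this model, sending on a port bound to vcpu 1 leaves toy_irq_asserted() false, so a handler that reads only vcpu 0's state never learns about the event. A check along the lines of toy_find_stranded_port() is one way a driver could confirm whether any bindings have drifted away from vcpu 0.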

 -- Keir

On 5/9/08 02:06, "James Harper" <james.harper@xxxxxxxxxxxxxxxx> wrote:

> (Bart - I hope you don't mind me sending your email to the list)
> 
> Keir,
> 
> As per a recent discussion I modified the IRQ code in the Windows GPLPV
> drivers so that only the vcpu_info[0] structure is used, instead of
> vcpu_info[current_cpu] structure. As per Bart's email below though, this
> has caused him to experience performance issues.
> 
> Have I understood correctly that only element 0 of the vcpu_info[] array
> is ever used, even if the interrupt actually occurs on another vcpu? Is this
> true for all versions of Xen? It seems that Bart's experience is exactly
> the opposite of mine - the change that fixed up the performance issues
> for me caused performance issues for him...
> 
> Bart: Can you have a look through the xen-devel list archives and have a
> read of a thread with a subject of "HVM windows - PCI IRQ firing on both
> CPU's", around the middle of last month? Let me know if you interpret
> that any differently to me...
> 
> Thanks
> 
> James
> 
> 
> 
>> -----Original Message-----
>> From: bart brooks [mailto:bart_brooks@xxxxxxxxxxx]
>> Sent: Friday, 5 September 2008 01:19
>> To: James Harper
>> Subject: Performance - Update GPLPV drivers -0.9.11-pre12
>> Importance: High
>> 
>> Hi James,
>> 
>> 
>> 
>> We have tracked down the issue where performance has dropped off after
>> version 0.9.11-pre9 and still exists in version 0.9.11-pre12.
>> 
>> Event channel interrupts for transmit are generated only on VCPU-0,
>> whereas for receive they are generated on all VCPUs in a round robin
>> fashion. Post 0.9.11-pre9 it is assumed that all the interrupts are
>> generated on VCPU-0, so the network interrupts generated on other VCPUs
>> are only processed if there is some activity going on VCPU-0 or an
>> outstanding DPC. This caused packets to be processed out of order and
>> led to retransmissions. Retransmissions happened after a timeout (200ms)
>> with no activity during that time. Overall it brought down the bandwidth
>> a lot, with huge gaps of no activity.
>> 
>> 
>> 
>> Instead of assuming that everything is on CPU-0, the following change was
>> made in the xenpci driver in the file evtchn.c in the function
>> EvtChn_Interrupt():
>> 
>> int cpu = KeGetCurrentProcessorNumber() & (MAX_VIRT_CPUS - 1);
>> 
>> This is the same code found in version 0.9.11-pre9.
>> 
>> 
>> 
>> After this change, we are getting numbers comparable to 0.9.11-pre9.
>> 
>> Bart
>> 
>> 



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
