

[Xen-devel] Re: Interrupt to CPU routing in HVM domains - again

To: James Harper <james.harper@xxxxxxxxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] Re: Interrupt to CPU routing in HVM domains - again
From: Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Date: Fri, 05 Sep 2008 08:43:49 +0100
Cc: bart brooks <bart_brooks@xxxxxxxxxxx>
I can absolutely assure you that only vcpu0's evtchn_pending flag influences
delivery of the xen_platform pci interrupt. In xen-unstable/xen-3.3, the
relevant function to look at is
xen/arch/x86/hvm/irq.c:hvm_assert_evtchn_irq(). It has always been this way.

In any case, there is only one interrupt line (currently), so there would
hardly be a scalability benefit to spreading out event channel bindings to
different vcpu selector words and evtchn_pending flags. :-)

The only thing I can think of (and should certainly be checked!) is that
some of your event channels are erroneously getting bound to vcpu != 0. Are
you running an irq load balancer or somesuch? Obviously event channels bound
to vcpu != 0 will now never be serviced, whereas before your changes you
would probabilistically 'get lucky'.
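
To make the failure mode above concrete, here is a minimal, self-contained
model in C. It is only a sketch and not the Xen or GPLPV source: every
model_* name, the two-vcpu and 64-port sizes, and the bound_vcpu[] table are
hypothetical, and the shortened fields upcall_pending and pending_sel stand in
for the per-vcpu pending flag and selector word discussed above. It assumes
the interrupt line follows vcpu0's flag only and that the guest handler scans
only vcpu 0's selector word.

```c
#include <stdint.h>
#include <string.h>

#define MODEL_VCPUS   2   /* hypothetical: two-vcpu guest */
#define MODEL_WORDS   2   /* hypothetical: 2 x 32 = 64 event channel ports */
#define BITS_PER_WORD 32

struct model_vcpu_info {
    uint8_t  upcall_pending;  /* per-vcpu "an event is pending" flag */
    uint32_t pending_sel;     /* one bit per word of the pending bitmap */
};

struct model_shared_info {
    struct model_vcpu_info vcpu_info[MODEL_VCPUS];
    uint32_t evtchn_pending[MODEL_WORDS];
    uint8_t  bound_vcpu[MODEL_WORDS * BITS_PER_WORD]; /* port -> vcpu */
};

/* Hypervisor side: mark a port pending on the vcpu it is bound to. */
static void model_set_pending(struct model_shared_info *s, unsigned port)
{
    unsigned v = s->bound_vcpu[port];

    s->evtchn_pending[port / BITS_PER_WORD] |= 1u << (port % BITS_PER_WORD);
    s->vcpu_info[v].pending_sel |= 1u << (port / BITS_PER_WORD);
    s->vcpu_info[v].upcall_pending = 1;
}

/* Assumption from the text: the single interrupt line follows vcpu0 only. */
static int model_irq_asserted(const struct model_shared_info *s)
{
    return s->vcpu_info[0].upcall_pending != 0;
}

/*
 * Guest side: service events using vcpu 0's selector word only.
 * Every pending bit in a selected word is consumed, whichever vcpu the
 * port is bound to. Returns the number of ports serviced.
 */
static unsigned model_service_vcpu0(struct model_shared_info *s)
{
    struct model_vcpu_info *vi = &s->vcpu_info[0];
    unsigned serviced = 0;

    vi->upcall_pending = 0;
    uint32_t sel = vi->pending_sel;
    vi->pending_sel = 0;

    for (unsigned w = 0; w < MODEL_WORDS; w++) {
        if (!(sel & (1u << w)))
            continue;                  /* word not flagged for vcpu 0 */
        uint32_t bits = s->evtchn_pending[w];
        s->evtchn_pending[w] = 0;
        while (bits) {                 /* count and clear each pending port */
            serviced++;
            bits &= bits - 1;
        }
    }
    return serviced;
}
```

In this model, a port bound to vcpu 1 sets only vcpu 1's flag and selector
bit, so the line is never asserted on its behalf and a vcpu0-only scan leaves
it pending. It is picked up only if a vcpu0-bound port in the same bitmap word
happens to fire as well.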

 -- Keir

On 5/9/08 02:06, "James Harper" <james.harper@xxxxxxxxxxxxxxxx> wrote:

> (Bart - I hope you don't mind me sending your email to the list)
> Keir,
> As per a recent discussion I modified the IRQ code in the Windows GPLPV
> drivers so that only the vcpu_info[0] structure is used, instead of
> vcpu_info[current_cpu] structure. As per Bart's email below though, this
> has caused him to experience performance issues.
> Have I understood correctly that only cpu 0 of the vcpu_info[] array is
> ever used even if the interrupt actually occurs on another vcpu? Is this
> true for all versions of Xen? It seems that Bart's experience is exactly
> the opposite of mine - the change that fixed up the performance issues
> for me caused performance issues for him...
> Bart: Can you have a look through the xen-devel list archives and have a
> read of a thread with a subject of "HVM windows - PCI IRQ firing on both
> CPU's", around the middle of last month? Let me know if you interpret
> that any differently to me...
> Thanks
> James
>> -----Original Message-----
>> From: bart brooks [mailto:bart_brooks@xxxxxxxxxxx]
>> Sent: Friday, 5 September 2008 01:19
>> To: James Harper
>> Subject: Performance - Update GPLPV drivers -0.9.11-pre12
>> Importance: High
>> Hi James,
>> We have tracked down the issue where performance has dropped off after
>> version 0.9.11-pre9 and still exists in version 0.9.11-pre12.
>> Event channel interrupts for transmit are generated only on VCPU-0,
>> whereas for receive they are generated on all VCPUs in a round robin
>> fashion. Post 0.9.11-pre9 it is assumed that all the interrupts are
>> generated on VCPU-0, so the network interrupts generated on other
>> VCPUs are only processed if there is some activity on VCPU-0 or an
>> outstanding DPC. This caused packets to be processed out of order and
>> retransmitted. Retransmissions happened after a timeout (200ms) with
>> no activity during that time. Overall it brought down the bandwidth a
>> lot, with huge gaps of no activity.
>> Instead of assuming that everything is on CPU-0, the following change
>> was made in the xenpci driver, in the file evtchn.c, in the function
>> EvtChn_Interrupt():
>> int cpu = KeGetCurrentProcessorNumber() & (MAX_VIRT_CPUS - 1);
>> This is the same code found in version 0.9.11-pre9.
>> After this change, we are getting numbers comparable to 0.9.11-pre9.
>> Bart
