To: Stefan Bader <stefan.bader@xxxxxxxxxxxxx>
Subject: Re: [Xen-devel] Re: Still struggling with HVM: tx timeouts on emulated nics
From: Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>
Date: Thu, 22 Sep 2011 18:44:31 +0100
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, Stefano Stabellini <Stefano.Stabellini@xxxxxxxxxxxxx>
On Thu, 22 Sep 2011, Stefan Bader wrote:
> On 22.09.2011 13:58, Stefan Bader wrote:
> > On 22.09.2011 12:30, Stefano Stabellini wrote:
> >> On Wed, 21 Sep 2011, Stefan Bader wrote:
> >>> On 21.09.2011 15:31, Stefano Stabellini wrote:
> >>>> On Wed, 21 Sep 2011, Stefan Bader wrote:
> >>>>> This is on a 3.0.4-based dom0 and domU with the 4.1.1 hypervisor. I tried
> >>>>> using the default 8139cp and the ne2k_pci emulated NICs. The 8139cp one at
> >>>>> least comes up and gets configured via DHCP, and initial pings also get
> >>>>> routed and answered correctly. But slightly higher traffic (like checking
> >>>>> for updates) hangs, and after a while there are messages about tx timeouts.
> >>>>> The ne2k_pci NIC has those issues almost immediately and never comes up
> >>>>> correctly.
> >>>>>
> >>>>> I am attaching the dmesg of the guest with apic=debug enabled. I am not
> >>>>> sure how this should look, but both NICs get configured with level,low
> >>>>> IRQs. Disk emulation seems to be OK, but that seems to use IO-APIC-edge,
> >>>>> and the other IRQs do not seem to be level either.
> >>>>
> >>>
> >>>> Does the e1000 emulated card work correctly?
> >>>
> >>> Yes, that one seems to work ok.
> >>>
> >>>> What happens if you disable interrupt remapping (see patch below)?
> >>>
> >>> 8139cp seems to work correctly now (much higher IRQ stats as well) and
> >>> e1000 still works. Both are then using IOAPIC-fasteoi.
> >>>
> >>
> >> That means there must be another subtle bug in Xen's interrupt
> >> remapping that only affects 8139cp emulation.
> >>
> > Right, or to be complete:
> > - e1000: ok
> > - 8139cp: unstable (setup is possible)
> > - ne2k_pci: not working (tx problems from the beginning)
> > 
> > The behaviour feels a bit like interrupts may get lost if they occur at a
> > higher rate. Why this affects the various drivers differently is a bit weird.
> >>
> 
> This is mainly speculation... Quite a while back there was this patch to
> events:
> 
> commit dffe2e1e1a1ddb566a76266136c312801c66dcf7
> Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@xxxxxxxxxx>
> Date:   Fri Aug 20 19:10:01 2010 -0700
> 
>     xen: handle events as edge-triggered
> 
> The commit message stated that Xen events are logically edge-triggered, so PV
> events were changed to be handled as edge interrupts. Would that not mean that,
> since xen-pirq-apic also uses events, the same applies and those should be
> apic-edge instead of level?

That commit refers to the way Linux internally handles these events,
which look like normal interrupts to the Linux irq subsystem. It is not
related to the way actual events are delivered from Xen to Linux, so it
shouldn't matter here.

I would add lots of printk's in:

xen/arch/x86/hvm/irq.c:__hvm_pci_intx_assert
xen/arch/x86/hvm/irq.c:assert_irq
xen/arch/x86/hvm/irq.c:assert_gsi

to find out why Xen is not injecting those interrupts.
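
To illustrate the kind of tracing meant here, below is a standalone toy
model in C. It is not Xen code. It assumes a level-triggered line is
tracked with a per-GSI assert counter and that only the 0 -> 1 transition
injects an interrupt; that is an assumption of the sketch, not something
confirmed in this thread. All names (toy_irq_state, toy_assert_line,
toy_deassert_line, toy_trace, TOY_NR_GSI) are hypothetical, and fprintf
stands in for printk.

```c
#include <assert.h>
#include <stdio.h>

#define TOY_NR_GSI 48 /* hypothetical table size, chosen for this sketch only */

/* Toy state: one assert counter and one injection counter per GSI. */
struct toy_irq_state {
    unsigned int assert_count[TOY_NR_GSI];
    unsigned int injected[TOY_NR_GSI];
};

/* Stand-in for printk(); writes the trace to stderr. */
#define toy_trace(...) fprintf(stderr, __VA_ARGS__)

/*
 * Assert a level-triggered line. In this model only the 0 -> 1 transition
 * injects; returns 1 if an interrupt was injected, 0 otherwise.
 */
static int toy_assert_line(struct toy_irq_state *s, unsigned int gsi)
{
    assert(gsi < TOY_NR_GSI);
    toy_trace("assert    gsi=%u count=%u\n", gsi, s->assert_count[gsi]);
    if (s->assert_count[gsi]++ == 0) {
        s->injected[gsi]++;
        toy_trace("inject    gsi=%u total=%u\n", gsi, s->injected[gsi]);
        return 1;
    }
    toy_trace("no inject gsi=%u (line already asserted)\n", gsi);
    return 0;
}

/* Deassert the line; the counter never goes below zero. */
static void toy_deassert_line(struct toy_irq_state *s, unsigned int gsi)
{
    assert(gsi < TOY_NR_GSI);
    if (s->assert_count[gsi] > 0)
        s->assert_count[gsi]--;
    toy_trace("deassert  gsi=%u count=%u\n", gsi, s->assert_count[gsi]);
}
```

With traces at each of these points, a run of "assert" lines that is never
followed by "inject" would show whether the line is stuck asserted or
whether the injection path is being skipped.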
