WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-devel

Re: [Xen-devel] Dom0 losing interrupts???

To: Juergen Gross <juergen.gross@xxxxxxxxxxxxxx>
Subject: Re: [Xen-devel] Dom0 losing interrupts???
From: André Przywara <andre.przywara@xxxxxxx>
Date: Mon, 14 Feb 2011 09:58:42 +0100
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Mon, 14 Feb 2011 01:00:55 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4D58D2D7.9010803@xxxxxxxxxxxxxx>
Organization: AMD
References: <4D58D2D7.9010803@xxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101208 Thunderbird/3.1.7
On 14.02.2011 07:59, Juergen Gross wrote:
Hi,

while trying to reproduce Andre's cpupool problem I ran into another issue:

Dom0 seems to lose hardware interrupts when it has more vcpus than pcpus
available. First I thought this could be due to my cpupool patches, but the
problem can be easily reproduced by pinning all Dom0 vcpus to a few physical
cpus and then running a parallel build.
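
The reproduction described above could be scripted roughly as follows. This is only a sketch, not the commands from the original report: it assumes the `xl` toolstack, a Dom0 named `Domain-0`, pcpus 1-2 as the pinning target, and a `make`-based build with 12 jobs. The helper names `pin_command`, `build_command`, and `reproduce` are hypothetical.

```python
import subprocess


def pin_command(domain="Domain-0", cpus="1-2"):
    """Build an `xl vcpu-pin` call that pins every vcpu of `domain` to `cpus`."""
    return ["xl", "vcpu-pin", domain, "all", cpus]


def build_command(jobs=12):
    """Build a parallel make invocation (the job count is an assumption)."""
    return ["make", f"-j{jobs}"]


def reproduce(run=subprocess.check_call):
    """Pin all Dom0 vcpus to a few pcpus, then load them with a parallel build."""
    run(pin_command())
    run(build_command())
```

Calling `reproduce()` needs root privileges on a Xen host and a source tree in the current directory; passing a different `run` callable (for example `print`) shows the commands without executing them.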

I used xen-unstable with kernel 2.6.32.24 from SLES11 SP1 on a 12-core Intel
Nehalem machine. I pinned all 12 Dom0 vcpus to pcpus 1-2 and started a parallel
build. After about 2 minutes the first lost interrupt was reported, and a
little later the next one. No Xen messages were printed:

[230644.814834] ata1: lost interrupt (Status 0x50)
[230682.814399] ata1: lost interrupt (Status 0x50)
[230690.814467] ata1: lost interrupt (Status 0x58)
...
[230856.718437] sd 4:2:0:0: [sda] megasas: RESET -843713 cmd=2a retries=0
[230856.739457] megaraid_sas: HBA reset handler invoked without an internal reset condition.
[230856.766435] megasas: [ 0]waiting for 16 commands to complete
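
To see how often the interrupts are lost, the kernel log can be filtered for the `lost interrupt` lines and the time between them computed. The following is a minimal sketch that assumes the dmesg line format shown above; the names `LOST_IRQ`, `lost_interrupts`, and `gaps` are hypothetical, and the sample input is copied from the log excerpt.

```python
import re

# Matches lines such as "[230644.814834] ata1: lost interrupt (Status 0x50)".
LOST_IRQ = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+(\S+): lost interrupt \(Status (0x[0-9a-fA-F]+)\)"
)


def lost_interrupts(lines):
    """Return (timestamp, device, status) for every lost-interrupt line."""
    events = []
    for line in lines:
        m = LOST_IRQ.match(line.strip())
        if m:
            events.append((float(m.group(1)), m.group(2), int(m.group(3), 16)))
    return events


def gaps(events):
    """Seconds elapsed between consecutive lost-interrupt events."""
    times = [t for t, _, _ in events]
    return [later - earlier for earlier, later in zip(times, times[1:])]


log = """\
[230644.814834] ata1: lost interrupt (Status 0x50)
[230682.814399] ata1: lost interrupt (Status 0x50)
[230690.814467] ata1: lost interrupt (Status 0x58)
[230856.766435] megasas: [ 0]waiting for 16 commands to complete
""".splitlines()

events = lost_interrupts(log)
intervals = gaps(events)
```

For the three `ata1` lines above, the intervals come out at roughly 38 and 8 seconds.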

Has anyone observed a similar behavior?

Yes, me again:-)

On the rare occasions when I couldn't trigger the bug (for example, when using a restricted Dom0), I observed interrupt problems, which mostly killed the network connection:
(XEN) do_IRQ: 0.89 No irq handler for vector (irq -1)
I could work around this temporarily by bringing the network interface down and up again, but the box became unstable later. Setup: hypervisor and tools at c/s 22858, Dom0 at the latest tip of PVOPS xen/stable-2.6.32.x (2.6.32.27).

Regards,
Andre.

