xen-devel

RE: [Xen-devel] XEN 4.0 + 2.6.31.13 pvops kernel : system crashes on starting 155th domU

To: "'Konrad Rzeszutek Wilk'" <konrad.wilk@xxxxxxxxxx>, "'John McCullough'" <jmccullo@xxxxxxxxxxx>
Subject: RE: [Xen-devel] XEN 4.0 + 2.6.31.13 pvops kernel : system crashes on starting 155th domU
From: "Yuvraj Agarwal" <yuvraj@xxxxxxxxxxx>
Date: Wed, 28 Apr 2010 15:51:19 -0700 (PDT)
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, 'Keir Fraser' <keir.fraser@xxxxxxxxxxxxx>
Delivery-date: Wed, 28 Apr 2010 15:52:32 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20100428140437.GA29653@xxxxxxxxxxxxxxxxxxx>
References: <C7FD6DD5.10D5B%keir.fraser@xxxxxxxxxxxxx> <4BD7DA02.3030107@xxxxxxxxxxx> <20100428140437.GA29653@xxxxxxxxxxxxxxxxxxx>
I tried making the change in 
linux-2.6-pvops.git/arch/x86/include/asm/irq_vectors.h

It was:
#define NR_VECTORS                     256
I changed it to
#define NR_VECTORS                       1024

I still get the same nr_irqs value (dmesg | grep -i nr_irq) before and 
after the change.

[    0.000000] nr_irqs_gsi: 48
[    0.500076] NR_IRQS:5120 nr_irqs:944

Also, as before, it crashes at the same number of domUs (154). I didn't 
mention earlier that this is a dual-socket Nehalem machine -- 2 (sockets) * 
4 cores per CPU * 2 (hyperthreading) = 16 logical CPUs.

--Yuvraj

-----Original Message-----
From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@xxxxxxxxxx]
Sent: Wednesday, April 28, 2010 7:05 AM
To: John McCullough
Cc: Keir Fraser; xen-devel@xxxxxxxxxxxxxxxxxxx; Yuvraj Agarwal
Subject: Re: [Xen-devel] XEN 4.0 + 2.6.31.13 pvops kernel : system crashes 
on starting 155th domU

On Tue, Apr 27, 2010 at 11:47:30PM -0700, John McCullough wrote:
> I did a little testing.
>
> With no kernel option:
> # dmesg | grep -i nr_irqs
> [    0.000000] nr_irqs_gsi: 88
> [    0.000000] NR_IRQS:4352 nr_irqs:256
>
> w/nr_irqs=65536:
> # dmesg | grep -i nr_irqs
> [    0.000000] Command line: root=/dev/sda1 ro quiet console=hvc0
> nr_irqs=65536
> [    0.000000] nr_irqs_gsi: 88
> [    0.000000] Kernel command line: root=/dev/sda1 ro quiet console=hvc0
> nr_irqs=65536
> [    0.000000] NR_IRQS:4352 nr_irqs:256
>
> Tweaking the NR_IRQS macro in the kernel will change the NR_IRQS output,
> but unfortunately that doesn't change nr_irqs and I run into the same
> limit (36 domUs on a less-beefy dual core machine).

If you have CONFIG_SPARSE_IRQ defined in your .config, nr_irqs gets
overwritten by some code that figures out how many IRQs you need based
on your CPU count.

So can you change NR_VECTORS in arch/x86/include/asm/irq_vectors.h to a
higher value and see what happens?
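
A rough sketch of the kind of CPU-count-based sizing described above. It assumes the heuristic resembles the x86 arch_probe_nr_irqs() of 2.6.31-era kernels (arch/x86/kernel/apic/io_apic.c); that is an assumption, so check the function in your own tree. The function name estimate_nr_irqs and the msi_or_ht flag are made up for illustration, and nr_cpu_ids=16 assumes dom0 sees all 16 logical CPUs. Under those assumptions the figures reported at the top of this message (NR_IRQS:5120, nr_irqs_gsi: 48) give the same result for NR_VECTORS 256 and 1024:

```python
def estimate_nr_irqs(nr_irqs_compiled, nr_vectors, nr_cpu_ids, nr_irqs_gsi,
                     msi_or_ht=True):
    """Model of the assumed sizing heuristic (not copied from any tree).

    nr_irqs_compiled -- the NR_IRQS value printed by the kernel
    nr_vectors       -- NR_VECTORS from irq_vectors.h
    nr_cpu_ids       -- number of possible CPUs seen by the kernel
    nr_irqs_gsi      -- the nr_irqs_gsi value printed by the kernel
    msi_or_ht        -- hypothetical flag standing in for CONFIG_PCI_MSI
                        or CONFIG_HT_IRQ being enabled
    """
    # First cap: no more IRQs than vectors across all CPUs.
    nr_irqs = min(nr_irqs_compiled, nr_vectors * nr_cpu_ids)

    # Second cap: GSIs plus a per-CPU allowance, plus room for dynamic IRQs.
    nr = nr_irqs_gsi + 8 * nr_cpu_ids
    if msi_or_ht:
        nr += nr_irqs_gsi * 16

    return min(nr_irqs, nr)


for nr_vectors in (256, 1024):
    print(nr_vectors, estimate_nr_irqs(5120, nr_vectors, 16, 48))
```

If the heuristic really has this shape, the second cap (which does not involve NR_VECTORS at all) is the binding one on a 16-CPU box, so raising NR_VECTORS alone would not be expected to move nr_irqs.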

>
> I did find this:
> http://blogs.sun.com/fvdl/entry/a_million_vms
> which references NR_DYNIRQS, which is in 2.6.18, but not in the pvops
> kernel.
>
> Watching /proc/interrupts, the domain irqs seem to be getting allocated
> from 248 downward until they hit some other limit:

Yeah. They hit the nr_irqs_gsi and don't go below that.
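
As a back-of-the-envelope check, the pool for dynamic IRQs is then whatever lies between nr_irqs_gsi and nr_irqs. The sketch below just does that subtraction for the two machines in this thread; dynamic_irq_pool is a hypothetical helper name, and the per-domU figure is only an upper bound because dom0's own event channels come out of the same pool.

```python
def dynamic_irq_pool(nr_irqs, nr_irqs_gsi):
    """IRQ numbers left above the GSI range, assuming dynamic IRQs are
    handed out downward from nr_irqs and stop at nr_irqs_gsi."""
    return max(nr_irqs - nr_irqs_gsi, 0)


# (label, nr_irqs, nr_irqs_gsi, domUs running when the limit was hit),
# all taken from the dmesg output and reports quoted in this thread.
machines = [
    ("John", 256, 88, 36),
    ("Yuvraj", 944, 48, 154),
]

for label, nr_irqs, gsi, domus in machines:
    pool = dynamic_irq_pool(nr_irqs, gsi)
    print(f"{label}: pool={pool}, at most {pool / domus:.1f} IRQs per domU")
```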

> ...
>  64:      59104  xen-pirq-ioapic-level  ioc0
>  89:          1   xen-dyn-event     evtchn:xenconsoled
>  90:          1   xen-dyn-event     evtchn:xenstored
>  91:          6   xen-dyn-event     vif36.0
>  92:        140   xen-dyn-event     blkif-backend
>  93:         97   xen-dyn-event     evtchn:xenconsoled
>  94:        139   xen-dyn-event     evtchn:xenstored
>  95:          7   xen-dyn-event     vif35.0
>  96:        301   xen-dyn-event     blkif-backend
>  97:        261   xen-dyn-event     evtchn:xenconsoled
>  98:        145   xen-dyn-event     evtchn:xenstored
>  99:          7   xen-dyn-event     vif34.0
> ...
> Perhaps the xen irqs are getting allocated out of the nr_irqs pool,
> while they could be allocated from the NR_IRQS pool?
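
To watch this without eyeballing /proc/interrupts, something like the following could tally the dynamic IRQs. It is only a sketch: the function names are made up, and it assumes each line is "IRQ: one or more counts, chip name, device name" as in the excerpt above, which is reused here as sample input.

```python
import re
from collections import Counter

# Excerpt quoted in this thread.
SAMPLE = """\
 64:      59104  xen-pirq-ioapic-level  ioc0
 89:          1   xen-dyn-event     evtchn:xenconsoled
 90:          1   xen-dyn-event     evtchn:xenstored
 91:          6   xen-dyn-event     vif36.0
 92:        140   xen-dyn-event     blkif-backend
 93:         97   xen-dyn-event     evtchn:xenconsoled
 94:        139   xen-dyn-event     evtchn:xenstored
 95:          7   xen-dyn-event     vif35.0
 96:        301   xen-dyn-event     blkif-backend
 97:        261   xen-dyn-event     evtchn:xenconsoled
 98:        145   xen-dyn-event     evtchn:xenstored
 99:          7   xen-dyn-event     vif34.0
"""


def parse_interrupts(text):
    """Return (irq, chip, name) tuples for the numbered IRQ lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3 or not fields[0].endswith(":"):
            continue
        irq_field = fields[0][:-1]
        if not irq_field.isdigit():
            continue  # skip named lines such as NMI: or LOC:
        i = 1
        while i < len(fields) and fields[i].isdigit():
            i += 1  # skip the count column(s), one per CPU
        if i >= len(fields):
            continue
        rows.append((int(irq_field), fields[i], " ".join(fields[i + 1:])))
    return rows


def summarize_dynamic(rows, chip="xen-dyn-event"):
    """Count IRQs of the given chip type, find the lowest one in use,
    and group them by name with the vifN.M suffix stripped."""
    dyn = [(irq, name) for irq, c, name in rows if c == chip]
    by_kind = Counter(re.sub(r"\d+\.\d+$", "", name) for _, name in dyn)
    lowest = min((irq for irq, _ in dyn), default=None)
    return len(dyn), lowest, by_kind


count, lowest, by_kind = summarize_dynamic(parse_interrupts(SAMPLE))
print(count, lowest, dict(by_kind))
```

On a live dom0 the same functions could be fed open("/proc/interrupts").read() instead of SAMPLE; comparing the lowest dynamic IRQ against nr_irqs_gsi shows how much room is left.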
>
> -John
>
>
>
>
> On 04/27/2010 08:45 PM, Keir Fraser wrote:
>> I think nr_irqs is specifiable on the command line on newer kernels. You
>> may be able to do nr_irqs=65536 as a kernel boot parameter, or something
>> like that, without needing to rebuild the kernel.
>>
>>   -- Keir
>>
>> On 28/04/2010 02:02, "Yuvraj Agarwal"<yuvraj@xxxxxxxxxxx>  wrote:
>>
>>
>>> Actually, I did identify the problem (don't know the fix), at least from
>>> the console logs. It's related to running out of nr_irqs (attached JPG
>>> for the console log).
>>>
>>>
>>> -----Original Message-----
>>> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
>>> [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Keir Fraser
>>> Sent: Tuesday, April 27, 2010 5:44 PM
>>> To: Yuvraj Agarwal; xen-devel@xxxxxxxxxxxxxxxxxxx
>>> Subject: Re: [Xen-devel] XEN 4.0 + 2.6.31.13 pvops kernel : system
>>> crashes on starting 155th domU
>>>
>>> On 27/04/2010 08:41, "Yuvraj Agarwal"<yuvraj@xxxxxxxxxxx>  wrote:
>>>
>>>
>>>> Attached is the output of /var/log/daemon.log and /var/log/xen/xend.log,
>>>> but as far as we can see we don't quite know what might be causing the
>>>> system to crash (no console access anymore and system becomes
>>>> unresponsive and needs to be power-cycled).  I have pasted only the
>>>> relevant bits of information (the last domU that did successfully start
>>>> and the next one that failed). It may be the case that all the log
>>>> messages weren't flushed before the system crashed...
>>>>
>>>> Does anyone know where this limit of 155 domU is coming from and how we
>>>> can fix/increase it?
>>>>
>>>>
>>> Get a serial line on a test box, and capture Xen logging output on it.
>>> You can both see if any crash messages come from Xen when the 155th
>>> domain is created, and also try the serial debug keys (e.g., try 'h' to
>>> get help to start with) to see whether Xen itself is still alive.
>>>
>>>   -- Keir
>>>
>>>
>>>
>>> _______________________________________________
>>> Xen-devel mailing list
>>> Xen-devel@xxxxxxxxxxxxxxxxxxx
>>> http://lists.xensource.com/xen-devel
>>>
>>
>>
>>
>
>
