
Re: [Xen-devel] XEN 4.0 + 2.6.31.13 pvops kernel : system crashes on starting 155th domU



On Wed, 2010-04-28 at 15:04 +0100, Konrad Rzeszutek Wilk wrote:
> On Tue, Apr 27, 2010 at 11:47:30PM -0700, John McCullough wrote:
> > I did a little testing.
> >
> > With no kernel option:
> > # dmesg | grep -i nr_irqs
> > [    0.000000] nr_irqs_gsi: 88
> > [    0.000000] NR_IRQS:4352 nr_irqs:256
> >
> > w/nr_irqs=65536:
> > # dmesg | grep -i nr_irqs
> > [    0.000000] Command line: root=/dev/sda1 ro quiet console=hvc0  
> > nr_irqs=65536
> > [    0.000000] nr_irqs_gsi: 88
> > [    0.000000] Kernel command line: root=/dev/sda1 ro quiet console=hvc0  
> > nr_irqs=65536
> > [    0.000000] NR_IRQS:4352 nr_irqs:256
> >
> > Tweaking the NR_IRQS macro in the kernel will change the NR_IRQS output,
> > but unfortunately that doesn't change nr_irqs, and I run into the same
> > limit (36 domUs on a less-beefy dual-core machine).
> 
> If you have CONFIG_SPARSE_IRQ defined in your .config, it gets
> overwritten by some code that figures out how many IRQs you need based
> on your CPU count.
> 
> So can you change NR_VECTORS in arch/x86/include/asm/irq_vectors.h to a
> higher value and see what happens?

Jeremy applied a patch of mine at the start of March which added some
extra headroom for dynamic IRQs:
        commit 6d4a9168207ade237098a401270959ecc0bdd1e9
        Author: Ian Campbell <ian.campbell@xxxxxxxxxx>
        Date:   Mon Mar 1 11:21:15 2010 +0000
        
            xen: allow some overhead in IRQ space for dynamic IRQs
        
If you have this patch then you can edit NR_DYNAMIC_IRQS in
arch/x86/include/asm/irq_vectors.h to increase the number of extra IRQs.
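
For illustration, here is roughly what that edit looks like. This is only a
sketch: it operates on a scratch copy rather than a real kernel tree, and both
values (the original 256 and the new 1024) are placeholders, not the values
actually in Jeremy's tree:

```shell
# Sketch only: the define lives in arch/x86/include/asm/irq_vectors.h in a
# real tree; here we fake that file so the commands can run anywhere.
mkdir -p /tmp/irqdemo
cat > /tmp/irqdemo/irq_vectors.h <<'EOF'
#define NR_DYNAMIC_IRQS 256
EOF
# Bump the headroom reserved for dynamically allocated (event-channel) IRQs:
sed -i 's/#define NR_DYNAMIC_IRQS .*/#define NR_DYNAMIC_IRQS 1024/' \
    /tmp/irqdemo/irq_vectors.h
grep NR_DYNAMIC_IRQS /tmp/irqdemo/irq_vectors.h
```

After editing the real header you would of course need to rebuild and reboot
into the new kernel.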

Ian.

> 
> >
> > I did find this:
> > http://blogs.sun.com/fvdl/entry/a_million_vms
> > which references NR_DYNIRQS, which is in 2.6.18, but not in the pvops  
> > kernel.
> >
> > Watching /proc/interrupts, the domain irqs seem to be getting allocated  
> > from 248 downward until they hit some other limit:
> 
> Yeah. They hit the nr_irqs_gsi and don't go below that.
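
As a rough back-of-the-envelope check (my assumption of about four dynamic
IRQs per guest is just read off the console/xenstore/vif/blkif entries in the
listing below):

```shell
# Dynamic IRQs are handed out downward from 248 until they reach
# nr_irqs_gsi (88 on this box), so the pool is finite:
slots=$(( 248 - 88 ))
echo "dynamic IRQ slots: $slots"             # 160
# At ~4 event-channel IRQs per guest, that pool supports roughly:
echo "approx domU limit: $(( slots / 4 ))"   # 40
```

which is in the same ballpark as the 36-domU limit seen on the dual-core
machine.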
> 
> > ...
> >  64:      59104  xen-pirq-ioapic-level  ioc0
> >  89:          1   xen-dyn-event     evtchn:xenconsoled
> >  90:          1   xen-dyn-event     evtchn:xenstored
> >  91:          6   xen-dyn-event     vif36.0
> >  92:        140   xen-dyn-event     blkif-backend
> >  93:         97   xen-dyn-event     evtchn:xenconsoled
> >  94:        139   xen-dyn-event     evtchn:xenstored
> >  95:          7   xen-dyn-event     vif35.0
> >  96:        301   xen-dyn-event     blkif-backend
> >  97:        261   xen-dyn-event     evtchn:xenconsoled
> >  98:        145   xen-dyn-event     evtchn:xenstored
> >  99:          7   xen-dyn-event     vif34.0
> > ...
> > Perhaps the Xen IRQs are getting allocated out of the nr_irqs pool,
> > while they could be allocated from the NR_IRQS pool?
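
Incidentally, you can watch the pool drain by counting the xen-dyn-event
lines. This sketch runs against a canned sample mirroring the listing above;
on a live dom0 you would point the grep at /proc/interrupts instead:

```shell
# Canned sample: one domU's worth of entries plus the daemons' channels.
cat > /tmp/interrupts.sample <<'EOF'
 93:         97   xen-dyn-event     evtchn:xenconsoled
 94:        139   xen-dyn-event     evtchn:xenstored
 95:          7   xen-dyn-event     vif35.0
 96:        301   xen-dyn-event     blkif-backend
EOF
# How many dynamic event-channel IRQs are currently in use:
grep -c xen-dyn-event /tmp/interrupts.sample
```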
> >
> > -John
> >
> >
> >
> >
> > On 04/27/2010 08:45 PM, Keir Fraser wrote:
> >> I think nr_irqs is specifiable on the command line on newer kernels. You 
> >> may
> >> be able to do nr_irqs=65536 as a kernel boot parameter, or something like
> >> that, without needing to rebuild the kernel.
> >>
> >>   -- Keir
> >>
> >> On 28/04/2010 02:02, "Yuvraj Agarwal"<yuvraj@xxxxxxxxxxx>  wrote:
> >>
> >>    
> >>> Actually, I did identify the problem (don't know the fix), at least from
> >>> the console logs. It's related to running out of nr_irqs (attached JPG
> >>> of the console log).
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
> >>> [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Keir Fraser
> >>> Sent: Tuesday, April 27, 2010 5:44 PM
> >>> To: Yuvraj Agarwal; xen-devel@xxxxxxxxxxxxxxxxxxx
> >>> Subject: Re: [Xen-devel] XEN 4.0 + 2.6.31.13 pvops kernel : system crashes
> >>> on starting 155th domU
> >>>
> >>> On 27/04/2010 08:41, "Yuvraj Agarwal"<yuvraj@xxxxxxxxxxx>  wrote:
> >>>
> >>>      
> >>>> Attached is the output of /var/log/daemon.log and /var/log/xen/xend.log,
> >>>> but as far as we can see we don't quite know what might be causing the
> >>>> system to crash (no console access anymore; the system becomes
> >>>> unresponsive and needs to be power-cycled). I have pasted only the
> >>>> relevant bits of information (the last domU that did successfully start
> >>>> and the next one that failed). It may be the case that not all the log
> >>>> messages were flushed before the system crashed...
> >>>>
> >>>> Does anyone know where this limit of 155 domUs is coming from and how we
> >>>> can fix/increase it?
> >>>
> >>> Get a serial line on a test box, and capture Xen logging output on it. You
> >>> can both see if any crash messages come from Xen when the 155th domain is
> >>> created, and also try the serial debug keys (e.g., try 'h' to get help to
> >>> start with) to see whether Xen itself is still alive.
> >>>
> >>>   -- Keir
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> Xen-devel mailing list
> >>> Xen-devel@xxxxxxxxxxxxxxxxxxx
> >>> http://lists.xensource.com/xen-devel
> >>>      
> >>
> >>
> >
> >
> 





 

