WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-devel


To: Jan Beulich <JBeulich@xxxxxxxxxx>
Subject: Re: [Xen-devel] irqbalance seg faults with 2.6.38 or later kernels [patch + solution included] if running under Xen hypervisor
From: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Date: Wed, 11 May 2011 09:10:58 -0400
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, crrodriguez@xxxxxxx, nhorman@xxxxxxxxxxxxx, bwalle@xxxxxxx, pbrobinson@xxxxxxxxx, notting@xxxxxxxxxx, arjan@xxxxxxxxxxxxxxx, anibal@xxxxxxxxxx
Delivery-date: Wed, 11 May 2011 06:14:23 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4DCA62150200007800040EEA@xxxxxxxxxxxxxxxxxx>
References: <20110511003347.GA29851@xxxxxxxxxxxx> <4DCA62150200007800040EEA@xxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.21 (2010-09-15)
On Wed, May 11, 2011 at 09:16:53AM +0100, Jan Beulich wrote:
> >>> On 11.05.11 at 02:33, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> wrote:
> > The reason behind it is that irqbalance parses the /proc/interrupts
> > and whenever it hits something it can't understand:
> > 
> >  RES:  191614137   73904910    Rescheduling interrupts
> > 
> > It will count the number of interrupts towards the IRQ 0. That IRQ does exist
> > when the kernel boots under baremetal:
> > 
> >   0:         46          0       IO-APIC-edge      timer
> > 
> > but under Xen, the timer interrupts are initialized much later:
> > 
> >  272:   41197188          0        xen-percpu-virq      timer0
> > 
> > and the first IRQ that is used is not zero, but rather one:
> > 
> >    1:      73037          0          0          0          0          0  xen-pirq-ioapic-edge  i8042
> > 
> > so when irqbalance tries to account for the IRQ 'RES' to the IRQ 0
> > it fails and segfaults. The attached patch fixes it for whoever else is
> > hitting this problem.
> 
> In the svn snapshot I have, I see
> 
>               /* lines with letters in front are special, like NMI count. Ignore */
>               if (!(line[0]==' ' || (line[0]>='0' && line[0]<='9')))
>                       break;
> 
> which I would think should be taking care of your problem (or
> I mis-read your description), and which was there already before

Not anymore. In kernels 2.6.37:

           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       
.. snip.
NMI:          0          0          0          0   Non-maskable interrupts
LOC:   12413629   12858323   16296183   11098466   Local timer interrupts

In 2.6.38 and later:
            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       
 TRM:          0          0          0          0          0          0   Thermal event interrupts
 THR:          0          0          0          0          0          0   Threshold APIC interrupts
 MCE:          0          0          0          0          0          0   Machine check exceptions

They added a space before the name, so line[0] is now ' ' and the
check you mentioned above no longer catches these lines. That check
could of course be augmented to handle this, as an alternative solution.
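
As a rough illustration of what such an augmented check might look like,
here is a minimal sketch. The helper name line_has_irq_number is
hypothetical and is not part of irqbalance; it only assumes that a line
describing a numbered IRQ has a digit as its first non-blank character.

```c
#include <ctype.h>

/* Hypothetical helper, not taken from irqbalance: returns 1 if the first
 * non-blank character of a /proc/interrupts line is a digit, 0 otherwise.
 * Lines such as " TRM:" or "NMI:" therefore return 0. */
static int line_has_irq_number(const char *line)
{
	/* Skip the left padding that 2.6.38 and later put before the name. */
	while (*line == ' ' || *line == '\t')
		line++;

	return isdigit((unsigned char)*line) != 0;
}
```

With this, "  0:   46   0   IO-APIC-edge   timer" would be accepted, while
" TRM:   0   0   Thermal event interrupts" and "NMI:   0   0" would both
be skipped regardless of padding.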

> 0.56. Or are you perhaps having the problem because you have
> 1000+ interrupts, thus causing even the non-numeric strings to
> get space padded on their left? In that case I'd rather think above
> check should be either improved or removed (replaced by your
> solution).
> 
> > I am not sure who the upstream maintainer is for this so
> > I am sending this patch to the different distros as well.
> 
> Copying Neil and Arjan.
> 
> Jan
> 
> > 
> > --- irqbalance-0.56.orig/procinterrupts.c   2010-06-10 10:45:55.000000000 -0400
> > +++ irqbalance-0.56/procinterrupts.c        2011-05-10 20:22:06.897465003 -0400
> > @@ -50,7 +50,7 @@ void parse_proc_interrupts(void)
> >             int cpunr;
> >             int      number;
> >             uint64_t count;
> > -           char *c, *c2;
> > +           char *c, *c2, *err;
> >  
> >             if (getline(&line, &size, file)==0)
> >                     break;
> > @@ -64,7 +64,11 @@ void parse_proc_interrupts(void)
> >                     continue;
> >             *c = 0;
> >             c++;
> > -           number = strtoul(line, NULL, 10);
> > +           number = strtoul(line, &err, 10);
> > +           /* Man page says that if that happens and number == 0, then it
> > +            * failed to parse. */
> > +           if (err == line && number == 0)
> > +                   continue;
> >             count = 0;
> >             cpunr = 0;
> >  
> 
> 
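
For anyone who wants to try the strtoul() end-pointer test from the patch
in isolation, here is a small standalone sketch. The helper name
parse_irq_number is hypothetical and does not exist in irqbalance. It
relies only on the documented behaviour that strtoul() stores the
original pointer in its second argument when no digits were converted.

```c
#include <stdlib.h>

/* Hypothetical helper, not taken from irqbalance: parse the leading IRQ
 * number of a /proc/interrupts field. Returns 1 and stores the value in
 * *out on success, or 0 if strtoul() consumed no digits. */
static int parse_irq_number(const char *field, unsigned long *out)
{
	char *end;
	unsigned long value = strtoul(field, &end, 10);

	/* No conversion was performed, e.g. for "RES" or " TRM". */
	if (end == field)
		return 0;

	*out = value;
	return 1;
}
```

Note that a genuine IRQ 0 ("  0") still parses successfully here, because
the end pointer moves past the digit. The additional number == 0 test in
the patch is harmless, but the pointer comparison alone should already
separate the two cases.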
