xen-devel
[Xen-devel] RE: kernel BUG at arch/x86/xen/mmu.c:1872
 
To: <giamteckchoon@xxxxxxxxx>
Subject: [Xen-devel] RE: kernel BUG at arch/x86/xen/mmu.c:1872
From: MaoXiaoyun <tinnycloud@xxxxxxxxxxx>
Date: Tue, 12 Apr 2011 11:30:32 +0800
Cc: jeremy@xxxxxxxx, xen devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, keir@xxxxxxx, ian.campbell@xxxxxxxxxx, konrad.wilk@xxxxxxxxxx, dave@xxxxxxxxxx
Delivery-date: Mon, 11 Apr 2011 20:31:23 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
Importance: Normal
In-reply-to: <BANLkTimKWanNYTjrEHaik_Z1eLXptoN1kQ@xxxxxxxxxxxxxx>
 
References: <COL0-MC1-F14hmBzxHs00230882@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>,
    <BLU157-w488E5FEBD5E2DBC0666EF1DAA70@xxxxxxx>,
    <BLU157-w5025BFBB4B1CDFA7AA0966DAA90@xxxxxxx>,
    <BLU157-w540B39FBA137B4D96278D2DAA90@xxxxxxx>,
    <BANLkTimgh_iip27zkDPNV9r7miwbxHmdVg@xxxxxxxxxxxxxx>,
    <BANLkTimkMgYNyANcKiZu5tJTL4==zdP3xg@xxxxxxxxxxxxxx>,
    <BLU157-w116F1BB57ABFDE535C7851DAA80@xxxxxxx>,
    <BANLkTimKWanNYTjrEHaik_Z1eLXptoN1kQ@xxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
 
 
 
Hi: 
  
       I have just kicked off the cpuidle=0 cpufreq=none tests.

       What is your Xen version?  Do you use the backend driver from 2.6.32.36?

       Besides the "TLB BUG", I have met at least two other issues:
       1) Xen 4.0.1 + 2.6.32.36 kernel + backend driver from 2.6.31  ==> causes "Bad grant reference" messages in the serial output
       2) Xen 4.0.1 + 2.6.32.36 kernel with its own backend driver   ==> causes disk errors like those below.
  
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
end_request: I/O error, dev tdb, sector 28699593
end_request: I/O error, dev tdb, sector 28699673
end_request: I/O error, dev tdb, sector 28699753
end_request: I/O error, dev tdb, sector 28699833
end_request: I/O error, dev tdb, sector 28699913
end_request: I/O error, dev tdb, sector 28699993
end_request: I/O error, dev tdb, sector 28700073
          thanks. 
  
  
> Date: Mon, 11 Apr 2011 23:25:19 +0800
> Subject: Re: kernel BUG at arch/x86/xen/mmu.c:1872
> From: giamteckchoon@xxxxxxxxx
> To: tinnycloud@xxxxxxxxxxx
> CC: xen-devel@xxxxxxxxxxxxxxxxxxx; dave@xxxxxxxxxx; ian.campbell@xxxxxxxxxx; konrad.wilk@xxxxxxxxxx; jeremy@xxxxxxxx; keir@xxxxxxx
>
> 2011/4/11 MaoXiaoyun <tinnycloud@xxxxxxxxxxx>:
> > Hi:
> >
> >      I believe this is the fix at much extent.
> >      Since I have my own test cases which with this patch, my test case will
> > success in 30 rounds run.
> >      Every round takes 8 hours.  While without this patch, tests fail every
> > round in 15 minutes.
> >
> >       So this really means fix most of the things.
> >
> >       But during running, I met another crash, from the log it looks like
> > has relation with
> > this BUG, since the crash log shows it is tlb related and this BUG also tlb
> > related.
>
> Are you able to run another test with cpuidle=0 cpufreq=none in kernel
> boot option? Just curious whether can you reproduce the tlb bug when
> you boot with cpuidle=0 cpufreq=none... ...
>
> >
> >       Well, I'm also have poor knowledge of kernel.
> >       Hope someone from Xen Devel offer some help.
> >
> >       Many thanks.
> >
> >> Date: Mon, 11 Apr 2011 20:16:53 +0800
> >> Subject: Re: kernel BUG at arch/x86/xen/mmu.c:1872
> >> From: giamteckchoon@xxxxxxxxx
> >> To: tinnycloud@xxxxxxxxxxx
> >> CC: xen-devel@xxxxxxxxxxxxxxxxxxx; dave@xxxxxxxxxx;
> >> ian.campbell@xxxxxxxxxx; konrad.wilk@xxxxxxxxxx; jeremy@xxxxxxxx;
> >> keir@xxxxxxx
> >>
> >> >
> >> > Hi,
> >> >
> >> > Sorry, since this mmu related BUG has been troubled me for very
> >> > long... I really want to "kill" this BUG but my knowledge in kernel
> >> > hacking and/or xen is very limited.
> >> >
> >> > While waiting for Jeremy or Konrad or others ...
> >> >
> >> > Many thanks for spending time to track down this mmu related BUG.  I
> >> > have backported the commit from
> >> >
> >> > http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=64141da587241301ce8638cc945f8b67853156ec
> >> > to 2.6.32.36 PVOPS kernel and patch attached.  I won't know whether
> >> > did I backport it correctly nor does it affects anything.  I am
> >> > currently testing the 2.6.32.36 PVOPS kernel with this patch applied
> >> > and also unset CONFIG_DEBUG_PAGEALLOC.  Currently running testcrash.sh
> >> > loop 1000 as I am unable to reproduce this mmu BUG 1872 in
> >> > testcrash.sh loop 100.  Please note that when CONFIG_DEBUG_PAGEALLOC
> >> > is unset, I can reproduce this mmu BUG 1872 easily within <50
> >> > testcrash.sh loop cycle with PVOPS version 2.6.32.24 to 2.6.32.36
> >> > kernel.  Now test with this backport patch to see whether I can
> >> > reproduce this mmu BUG... ...
> >> >
> >> > Kindest regards,
> >> > Giam Teck Choon
> >> >
> >>
> >> I have tested with my backport patch and it is working fine as I am
> >> unable to reproduce the mmu.c 1872 or 1860 bug with
> >> CONFIG_DEBUG_PAGEALLOC not set. I tested with testcrash.sh loop 100
> >> and 1000. Now doing testcrash.sh loop 10000.
> >>
> >> Xiaoyun, is it possible for you to test my patch and see whether can
> >> you reproduce the mmu.c 1872/1860 bug?
> >>
> >> Can anyone of you review my patch?
> >>
> >> I will post a format patch according to
> >> Documentation/SubmittingPatches in my next reply and hopefully can be
> >> reviewed.
> >>
> >> Thanks.
> >>
> >> Kindest regards,
> >> Giam Teck Choon
> >
 _______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
 