[Xen-devel] xen: mm.c MFN errors

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-devel] xen: mm.c MFN errors
From: Shriram Rajagopalan <rshriram@xxxxxxxxx>
Date: Thu, 24 Feb 2011 14:01:15 -0800
I had this problem in 4.0.1 (still not resolved there) and it persists in
4.1.0-rc6-pre. Apparently I am not the only one hitting it:
http://lists.xensource.com/archives/html/xen-users/2011-02/msg00362.html
reports the same issue on Xen 4.0.2-rc2.

My workload was a simple 2.6.18 domU (512M) running just 2 threads that
constantly malloc, touch and free memory.
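For anyone trying to reproduce this, the workload can be approximated roughly as below. This is only a sketch in Python, not the program I actually ran: the function names, the buffer size, the iteration count and the 4 KiB page size are all assumptions made for illustration.

```python
import threading

PAGE_SIZE = 4096  # assumed x86 page size


def churn(iterations, alloc_bytes, counter, index):
    """Repeatedly allocate a buffer, write to every page in it, then drop it."""
    for _ in range(iterations):
        buf = bytearray(alloc_bytes)              # allocate (stands in for malloc)
        for off in range(0, alloc_bytes, PAGE_SIZE):
            buf[off] = 0xFF                       # write one byte per page
        del buf                                   # release (stands in for free)
        counter[index] += 1                       # each thread owns one slot


def run_workload(threads=2, iterations=100, alloc_bytes=16 * 1024 * 1024):
    """Run `threads` churn loops in parallel; return completed rounds per thread."""
    counter = [0] * threads
    workers = [
        threading.Thread(target=churn, args=(iterations, alloc_bytes, counter, i))
        for i in range(threads)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return counter
```

Calling run_workload() inside the guest keeps dirtying fresh pages, which is the kind of memory churn that matters for replication here.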

I enabled Remus on the domain (memory replication only), which basically
exercises the xc_domain_save/xc_domain_restore paths.

Issue 1:
On the primary, during replication, the xm dmesg log is flooded with messages like:
........
(XEN) mm.c:889:d0 Error getting mfn 468900 (pfn 1fdd1) from L1 entry 8000000468900625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 4688fd (pfn 1fdd4) from L1 entry 80000004688fd625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 4688f8 (pfn 1fdd9) from L1 entry 80000004688f8625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 46889f (pfn 1fe32) from L1 entry 800000046889f625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 46888c (pfn 1fe45) from L1 entry 800000046888c625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 468877 (pfn 1fe5a) from L1 entry 8000000468877625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 468876 (pfn 1fe5b) from L1 entry 8000000468876625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 468825 (pfn 1feac) from L1 entry 8000000468825625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 468824 (pfn 1fead) from L1 entry 8000000468824625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 468820 (pfn 1feb1) from L1 entry 8000000468820625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 46881c (pfn 1feb5) from L1 entry 800000046881c625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 46881b (pfn 1feb6) from L1 entry 800000046881b625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 46881a (pfn 1feb7) from L1 entry 800000046881a625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 468817 (pfn 1feba) from L1 entry 8000000468817625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 4687ec (pfn 1fee5) from L1 entry 80000004687ec625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 4687e7 (pfn 1feea) from L1 entry 80000004687e7625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 4687c8 (pfn 1ff09) from L1 entry 80000004687c8625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 4687a9 (pfn 1ff28) from L1 entry 80000004687a9625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 468799 (pfn 1ff38) from L1 entry 8000000468799625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 468798 (pfn 1ff39) from L1 entry 8000000468798625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 468791 (pfn 1ff40) from L1 entry 8000000468791625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 468790 (pfn 1ff41) from L1 entry 8000000468790625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 46878d (pfn 1ff44) from L1 entry 800000046878d625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 46872d (pfn 1ffa4) from L1 entry 800000046872d625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 46870d (pfn 1ffc4) from L1 entry 800000046870d625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 4686fe (pfn 1ffd3) from L1 entry 80000004686fe625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 4686e3 (pfn 1ffee) from L1 entry 80000004686e3625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 4686dd (pfn 1fff4) from L1 entry 80000004686dd625 for l1e_owner=0, pg_owner=17
(XEN) mm.c:889:d0 Error getting mfn 4686dc (pfn 1fff5) from L1 entry 80000004686dc625 for l1e_owner=0, pg_owner=17
...........
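To make these messages easier to compare, here is a small sketch that pulls the fields out of each mm.c:889 line and decodes the L1 entry. It assumes the standard x86-64 PTE layout (frame number in bits 12..51, NX in bit 63, hardware flags in the low bits); the regular expression and the helper name decode_l1_errors are my own and only match the message format shown above.

```python
import re

# Matches one mm.c:889 message after whitespace has been normalized.
L1_ERROR = re.compile(
    r"Error getting mfn (?P<mfn>[0-9a-f]+) \(pfn (?P<pfn>[0-9a-f]+)\) "
    r"from L1 entry (?P<entry>[0-9a-f]+) "
    r"for l1e_owner=(?P<l1e_owner>\d+), pg_owner=(?P<pg_owner>\d+)"
)

FRAME_MASK = ((1 << 52) - 1) & ~0xFFF  # bits 12..51 of an x86-64 PTE
FLAG_NAMES = {
    0x001: "PRESENT",
    0x002: "RW",
    0x004: "USER",
    0x020: "ACCESSED",
    0x040: "DIRTY",
}


def decode_l1_errors(text):
    """Return one dict per 'Error getting mfn' message found in text."""
    flat = " ".join(text.split())  # undo the mail client's line wrapping
    results = []
    for m in L1_ERROR.finditer(flat):
        entry = int(m.group("entry"), 16)
        results.append({
            "mfn": int(m.group("mfn"), 16),
            "pfn": int(m.group("pfn"), 16),
            "entry_mfn": (entry & FRAME_MASK) >> 12,   # frame number held in the PTE
            "nx": bool(entry >> 63),                   # no-execute bit
            "flags": sorted(n for bit, n in FLAG_NAMES.items() if entry & bit),
            "l1e_owner": int(m.group("l1e_owner")),
            "pg_owner": int(m.group("pg_owner")),
        })
    return results
```

For the first message above, the frame number extracted from the entry is 0x468900, the same as the logged mfn, and the low bits 0x625 decode to present, user and accessed, with the writable bit clear and NX set.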

Issue 2:
The VM fails to recover on the secondary when I destroy it on the primary. xm
dmesg on the secondary again shows errors related to pagetable pinning:
(XEN) mm.c:802:d0 Bad L1 flags 400010
(XEN) mm.c:1204:d0 Failure in alloc_l1_table: entry 16
(XEN) mm.c:2142:d0 Error while validating mfn 4229c1 (pfn 1bf44) for type 1000000000000000: caf=8000000000000002 taf=1000000000000001
(XEN) mm.c:897:d0 Attempt to create linear p.t. with write perms
(XEN) mm.c:1348:d0 Failure in alloc_l2_table: entry 433
(XEN) mm.c:2142:d0 Error while validating mfn 421639 (pfn 1e654) for type 2000000000000000: caf=8000000000000002 taf=2000000000000001
(XEN) mm.c:1458:d0 Failure in alloc_l3_table: entry 1
(XEN) mm.c:2142:d0 Error while validating mfn 44d975 (pfn 1e62d) for type 3000000000000000: caf=8000000000000002 taf=3000000000000001
(XEN) mm.c:2965:d0 Error while pinning mfn 44d975
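The validation failures above appear to climb the page-table hierarchy from L1 to L3, ending at the mfn that could not be pinned. The sketch below extracts that chain. It assumes the top four bits of the logged type value give the page-table level, which agrees with the alloc_l1_table/alloc_l2_table/alloc_l3_table pairing in this log but which I have not checked against the Xen source; the helper name validation_failures is hypothetical.

```python
import re

# Matches one mm.c:2142 message after whitespace has been normalized.
VALIDATE = re.compile(
    r"Error while validating mfn (?P<mfn>[0-9a-f]+) \(pfn (?P<pfn>[0-9a-f]+)\) "
    r"for type (?P<type>[0-9a-f]+): caf=(?P<caf>[0-9a-f]+) taf=(?P<taf>[0-9a-f]+)"
)


def validation_failures(text):
    """Return the 'Error while validating' records in the order they appear."""
    flat = " ".join(text.split())  # undo the mail client's line wrapping
    records = []
    for m in VALIDATE.finditer(flat):
        type_bits = int(m.group("type"), 16)
        records.append({
            "mfn": int(m.group("mfn"), 16),
            "pfn": int(m.group("pfn"), 16),
            "level": type_bits >> 60,  # assumed: top nibble is the table level
            "caf": int(m.group("caf"), 16),
            "taf": int(m.group("taf"), 16),
        })
    return records
```

On the log above this gives levels 1, 2 and 3 in that order, and the last record's mfn (44d975) is the one named in the final "Error while pinning" line.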

and xend.log on the target machine shows:
[2011-02-24 13:25:25 2868] DEBUG (XendCheckpoint:278) restore:shadow=0x0, _static_max=0x20000000, _static_min=0x0,
[2011-02-24 13:25:25 2868] DEBUG (XendCheckpoint:305) [xc_restore]: /usr/lib/xen/bin/xc_restore 16 9 1 2 0 0 0 0
[2011-02-24 13:28:14 2868] INFO (XendCheckpoint:423) xc: error: 0-length read: Internal error
[2011-02-24 13:28:14 2868] INFO (XendCheckpoint:423) xc: error: read_exact_timed failed (read rc: 0, errno: 0): Internal error
[2011-02-24 13:28:14 2868] INFO (XendCheckpoint:423) xc: error: Error when reading ctxt (0 = Success): Internal error
[2011-02-24 13:28:14 2868] INFO (XendCheckpoint:423) xc: error: Failed to pin batch of 18 page tables (22 = Invalid argument): Internal error
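The xend.log excerpt can be summarized with a short sketch like the one below, which rejoins wrapped lines, splits each record into its fields and lists the xc errors. The record layout is inferred only from the lines shown here, and the function names are hypothetical.

```python
import re
from datetime import datetime

# A record starts with "[YYYY-MM-DD HH:MM:SS <pid>]"; other lines are continuations.
RECORD_START = re.compile(r"\[\d{4}-\d\d-\d\d \d\d:\d\d:\d\d \d+\]")
RECORD = re.compile(
    r"\[(?P<ts>\d{4}-\d\d-\d\d \d\d:\d\d:\d\d) (?P<pid>\d+)\] "
    r"(?P<level>[A-Z]+) \((?P<src>[^)]+)\) (?P<msg>.*)"
)


def parse_xend_log(text):
    """Rejoin wrapped lines and return one dict per log record."""
    joined = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if RECORD_START.match(line):
            joined.append(line)
        elif joined:
            joined[-1] += " " + line  # continuation of the previous record
    records = []
    for rec in joined:
        m = RECORD.match(rec)
        if m:
            records.append({
                "ts": datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S"),
                "level": m.group("level"),
                "src": m.group("src"),
                "msg": m.group("msg"),
            })
    return records


def xc_errors(records):
    """Return only the records that carry an 'xc: error:' message."""
    return [r for r in records if r["msg"].startswith("xc: error:")]


def seconds_until_first_error(records):
    """Seconds between the xc_restore launch record and the first xc error."""
    start = next(r for r in records if "[xc_restore]" in r["msg"])
    first = xc_errors(records)[0]
    return (first["ts"] - start["ts"]).total_seconds()
```

On the excerpt above this gives five xc errors, all logged at 13:28:14, which is 169 seconds after xc_restore was launched; the last one is the failure to pin a batch of 18 page tables with errno 22 (Invalid argument).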


I suspect this has something to do with the
canonicalization/uncanonicalization code, but I cannot pinpoint
where exactly at the moment.
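For readers less familiar with that step, the idea is roughly as follows: on save, machine frame numbers (MFNs) inside page-table entries are rewritten to guest pseudo-physical frame numbers (PFNs), and on restore they are rewritten back to whatever MFNs the target host allocated. The toy model below illustrates only this idea. It is not the libxc code; the function names, the dict-based m2p/p2m tables and the choice to zero unmappable entries are assumptions made for illustration.

```python
FRAME_MASK = ((1 << 52) - 1) & ~0xFFF  # bits 12..51 of an x86-64 PTE
PRESENT = 0x1


def _swap_frame(pte, table):
    """Replace the frame number in a present PTE using a lookup table."""
    if not pte & PRESENT:
        return pte                      # non-present entries pass through
    frame = (pte & FRAME_MASK) >> 12
    new_frame = table.get(frame)
    if new_frame is None:
        return 0                        # toy choice: drop mappings we cannot translate
    return (pte & ~FRAME_MASK) | (new_frame << 12)


def canonicalize(l1_table, m2p):
    """Save side: rewrite MFNs to PFNs (m2p maps mfn -> pfn)."""
    return [_swap_frame(pte, m2p) for pte in l1_table]


def uncanonicalize(l1_table, p2m):
    """Restore side: rewrite PFNs to the new host's MFNs (p2m maps pfn -> mfn)."""
    return [_swap_frame(pte, p2m) for pte in l1_table]
```

In this model, an entry whose frame changes between being canonicalized and being used on the other side would end up pointing at the wrong frame or carrying stale flags, which is the kind of inconsistency I have in mind.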
shriram
