To: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] grant table unmap failure makes guest unreapable and causes xen oops
From: Kip Macy <kip.macy@xxxxxxxxx>
Date: Wed, 6 Jul 2005 11:08:56 -0700
I just hit this, so I don't fully understand it yet, but it looks like
there may be a race between grant_table unmap requests and the
garbage collection of domain memory for crashed guests.

My centos4 domU isn't finding its init (this may be the breakage in
file-backed VBDs that Mark mentioned - it was finding it a couple of
days ago) and thus calls HYPERVISOR_crash:

Freeing unused kernel memory: 92k freed
Kernel panic - not syncing: No init found.  Try passing init= option to kernel.

[root@rs0 ~]# xm list
Name              Id  Mem(MB)  CPU VCPU(s)  State  Time(s)  Console
Domain-0           0      251    0       1  r----   1013.3
rhel4_0            1        0    3       1  ----c      0.5     9601

The following errors show up on the console:

(XEN) (file=grant_table.c, line=500) Bad handle (0).
(XEN) (file=grant_table.c, line=500) Bad handle (49152).
(XEN) (file=grant_table.c, line=500) Bad handle (49792).
(XEN) (file=grant_table.c, line=500) Bad handle (0).
(XEN) (file=grant_table.c, line=500) Bad handle (61440).

And the guest never goes away.

[root@rs0 ~]# xm destroy 1
[root@rs0 ~]# xm list
Name              Id  Mem(MB)  CPU VCPU(s)  State  Time(s)  Console
Domain-0           0      251    0       1  r----   1208.2
rhel4_0            1        0    3       1  ----c      0.5     9601

Restarting xend at this point is interesting:

[root@rs0 ~]# xend start
DBMap>introduceDomain> 1 69067 <EventChannel dom1:0:14 dom2:1:2> /domain/4042ebcc-778d-4488-a0bd-6152c42ba98b
Traceback (most recent call last):
<snip>
RuntimeError: (9, 'Bad file descriptor')

Message from syslogd@rs0 at Wed Jul  6 10:59:17 2005 ...
rs0 xenstored: xenstored corruption: connection id 0: err Bad address: Unknown error 14 (Bad address)
Exception starting xend: (9, 'Bad file descriptor')

On the console we see:
(XEN) (file=/build/kmacy/xen/xen-unstable.hg/xen/include/asm/mm.h, line=187) Error pfn be9: rd=ffbf8a80, od=ffbf8a80, caf=00000000, taf=f0000001
(XEN) (file=/build/kmacy/xen/xen-unstable.hg/xen/include/asm/mm.h, line=187) Error pfn 10dcb: rd=ffbf8a80, od=00000000, caf=00000000, taf=f0000000
[ERR] corruptxenstored corruption: connection id 0: err Bad address: Unknown error 14 (Bad address)

*NOW* comes the fun part:
[root@rs0 ~]# /sbin/shutdown -r now

Broadcast message from root (pts/1) (Wed Jul  6 11:01:00 2005):

The system is going down for reboot NOW!
INIT: Sending processes the TERM signal
(XEN) CPU:    0
(XEN) EIP:    e008:[<ff10b882>]
(XEN) EFLAGS: 00210202   CONTEXT: hypervisor
(XEN) eax: 0000000a   ebx: 00000000   ecx: 00000000   edx: 00000003
(XEN) esi: 00000001   edi: ffbf2700   ebp: ffbf1004   esp: ff103e04
(XEN) cr0: 8005003b   cr3: 181cd000
(XEN) ds: e010   es: e010   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) Xen stack trace from esp=ff103e04:
(XEN)    00000001 00000052 00000000 00000400 fec6b000 ff1a9900 fc400000 00000f00
(XEN)    00000000 00000000 00000001 000a0067 fc400f00 ff1a9900 ff1a7080 [ff1269de]
(XEN)    ff1a7080 ff1a9900 000000a0 00000000 fec6b000 32ff0001 ff1a9900 0000067c
(XEN)    00000261 000a0067 fec72000 [ff12a110] 000a0067 ff1a9900 00000000 [ff13b9ef]
(XEN)    181cd000 ff103fb4 00000001 c7910985 ff1a9b80 fec72000 ff1a9900 [ff12a266]
(XEN)    ff1a9900 fec72000 ff1b2000 [ff12ba14] fec71000 ff1a9b80 ff103fb4 ff1a9b90
(XEN)    c7910985 fe31e440 00000000 00000000 00000000 00000000 ff1a9900 [ff12d55e]
(XEN)    ff1a9900 00000000 0000000c 00200286 ff103fb4 ff1a9900 [ff13b9ef] 181cd000
(XEN)    001446c9 00000000 c7910984 ff1a9900 00018ef0 18ef0061 [ff12eb6a] 00018ef0
(XEN)    ffffffff 00000010 ff1a9900 00000007 c873e000 00010000 c0568ee0 000002db
(XEN)    32db0001 ff103fb4 ff1a9b80 ff1a9900 00000000 fe3f8b6c ff1a9900 ff103fb4
(XEN)    c7910984 ffbf3080 [ff13e867] 00000000 00000000 00000000 00000000 00000001
(XEN)    00000005 00000020 ee000000 ffbf3080 ffbf3bf8 ffbf3080 ffbf3080 ffbf3080
(XEN)    00007ff0 c8623284 b6e69000 [ff14a8f3] c8683eec 00000001 00000000 00007ff0
(XEN)    c8623284 b6e69000 0000001a 000e0003 c0115b33 00000061 00200282 c8683eec
(XEN)    00000069 0000007b 0000007b 00000000 00000000 00000000 ffbf3080
(XEN) Xen call trace from esp=ff103e04:
(XEN)    [<ff1269de>] [<ff12a110>] [<ff13b9ef>] [<ff12a266>] [<ff12ba14>] [<ff12d55e>]
(XEN)    [<ff13b9ef>] [<ff12eb6a>] [<ff13e867>] [<ff14a8f3>]

****************************************
Panic on CPU0:
CPU0 FATAL PAGE FAULT
[error_code=0000]
Faulting linear address: 00000004
****************************************

Reboot in five seconds...


Line 910 of "grant_table.c" starts at address 0xff10b87f <gnttab_check_unmap+175> and ends at 0xff10b888 <gnttab_check_unmap+184>.
<...>
             ( readonly ? 1 : (!(map->ref_and_flags & GNTMAP_readonly))))
        {
            ref = (map->ref_and_flags >> MAPTRACK_REF_SHIFT);
            act = &rgt->active[ref];      <- line 910
 
            spin_lock(&rgt->lock);

            if ( act->frame != frame )
<...>
0xff10b882 <gnttab_check_unmap+178>:    mov    0x4(%ecx),%eax
0xff1269de <put_page_from_l1e+270>:     test   %eax,%eax
0xff12a110 <revalidate_l1+176>: jmp    0xff12a090 <revalidate_l1+48>
0xff13b9ef <__flush_tlb_mask+239>:      mov    0x44(%ebx),%eax
0xff12a266 <ptwr_flush+246>:    mov    %edi,(%esp)
0xff12d55e <do_mmuext_op+1150>: jmp    0xff12d17a <do_mmuext_op+154>
0xff13b9ef <__flush_tlb_mask+239>:      mov    0x44(%ebx),%eax
0xff12eb6a <ptwr_do_page_fault+506>:    mov    %eax,%esi
0xff13e867 <do_page_fault+423>: test   %eax,%eax
0xff14a8f3 <hypercall+83>:      mov    %eax,0x18(%esp)
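
Putting the dump together: the faulting EIP ff10b882 is the
"mov 0x4(%ecx),%eax" listed above, ecx is 00000000 in the register dump,
and the faulting linear address is 00000004. That is what a load through
a NULL base pointer at displacement 4 looks like. The sketch below only
illustrates that arithmetic. struct fake_grant_table and both helper
functions are hypothetical names, and the layout is an assumed stand-in,
not Xen's actual grant table structure. Which pointer in
gnttab_check_unmap is actually NULL is not established here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit layout, NOT Xen's real grant table structure.
 * Pointers are modelled as uint32_t so the offsets match an x86-32
 * build regardless of the host this is compiled on. */
struct fake_grant_table {
    uint32_t shared;   /* stand-in for a 4-byte pointer at offset 0 */
    uint32_t active;   /* stand-in for a 4-byte pointer at offset 4 */
};

/* Address the CPU touches for "mov disp(%base),%eax". */
static uint32_t effective_address(uint32_t base, uint32_t disp)
{
    return base + disp;
}

/* Address of the 'active' member, given the table's base address. */
static uint32_t active_member_address(uint32_t table_base)
{
    return effective_address(
        table_base,
        (uint32_t)offsetof(struct fake_grant_table, active));
}
```

With a base of 0 (the ecx value in the dump), active_member_address(0)
evaluates to 4, the same as the reported faulting linear address.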
