To: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] Re: grant table unmap failure makes guest unreapable and causes xen oops
From: Kip Macy <kip.macy@xxxxxxxxx>
Date: Wed, 6 Jul 2005 11:18:07 -0700
It looks like some of these problems may have been fixed by check-ins
in the last couple of hours. I'm doing a make world right now.

 -Kip

         

On 7/6/05, Kip Macy <kip.macy@xxxxxxxxx> wrote:
> I just hit this so I don't fully understand it yet, but it looks like
> there may be some race condition with grant_table unmap requests and
> garbage collection of domain memory on crashed guests.
> 
> My centos4 domU isn't finding its init (this may be the breakage in
> file-backed VBDs that Mark mentioned - it was finding it a couple of
> days ago) and thus calls HYPERVISOR_crash:
> 
> Freeing unused kernel memory: 92k freed
> Kernel panic - not syncing: No init found.  Try passing init= option to kernel.
> 
> [root@rs0 ~]# xm list
> Name        Id  Mem(MB)  CPU  VCPU(s)  State  Time(s)  Console
> Domain-0     0      251    0        1  r----   1013.3
> rhel4_0      1        0    3        1  ----c      0.5     9601
> 
> The following errors show up on the console:
> 
> (XEN) (file=grant_table.c, line=500) Bad handle (0).
> (XEN) (file=grant_table.c, line=500) Bad handle (49152).
> (XEN) (file=grant_table.c, line=500) Bad handle (49792).
> (XEN) (file=grant_table.c, line=500) Bad handle (0).
> (XEN) (file=grant_table.c, line=500) Bad handle (61440).
> 
> And the guest never goes away.
> 
> [root@rs0 ~]# xm destroy 1
> [root@rs0 ~]# xm list
> Name        Id  Mem(MB)  CPU  VCPU(s)  State  Time(s)  Console
> Domain-0     0      251    0        1  r----   1208.2
> rhel4_0      1        0    3        1  ----c      0.5     9601
> 
> Restarting xend at this point is interesting:
> 
> [root@rs0 ~]# xend start
> DBMap>introduceDomain> 1 69067 <EventChannel dom1:0:14 dom2:1:2>
> /domain/4042ebcc-778d-4488-a0bd-6152c42ba98b
> Traceback (most recent call last):
> <snip>
> RuntimeError: (9, 'Bad file descriptor')
> 
> Message from syslogd@rs0 at Wed Jul  6 10:59:17 2005 ...
> rs0 xenstored: xenstored corruption: connection id 0: err Bad address: Unknown error 14 (Bad address)
> Exception starting xend: (9, 'Bad file descriptor')
> 
> On the console we see:
> (XEN) (file=/build/kmacy/xen/xen-unstable.hg/xen/include/asm/mm.h, line=187) Error pfn be9: rd=ffbf8a80, od=ffbf8a80, caf=00000000, taf=f0000001
> (XEN) (file=/build/kmacy/xen/xen-unstable.hg/xen/include/asm/mm.h, line=187) Error pfn 10dcb: rd=ffbf8a80, od=00000000, caf=00000000, taf=f0000000
> [ERR] corruptxenstored corruption: connection id 0: err Bad address: Unknown error 14 (Bad address)
> 
> *NOW* comes the fun part:
> [root@rs0 ~]# /sbin/shutdown -r now
> 
> Broadcast message from root (pts/1) (Wed Jul  6 11:01:00 2005):
> 
> The system is going down for reboot NOW!
> INIT: Sending processes the TERM signal
> (XEN) CPU:    0
> (XEN) EIP:    e008:[<ff10b882>]
> (XEN) EFLAGS: 00210202   CONTEXT: hypervisor
> (XEN) eax: 0000000a   ebx: 00000000   ecx: 00000000   edx: 00000003
> (XEN) esi: 00000001   edi: ffbf2700   ebp: ffbf1004   esp: ff103e04
> (XEN) cr0: 8005003b   cr3: 181cd000
> (XEN) ds: e010   es: e010   fs: 0000   gs: 0000   ss: e010   cs: e008
> (XEN) Xen stack trace from esp=ff103e04:
> (XEN)    00000001 00000052 00000000 00000400 fec6b000 ff1a9900 fc400000 00000f00
> (XEN)    00000000 00000000 00000001 000a0067 fc400f00 ff1a9900 ff1a7080 [ff1269de]
> (XEN)    ff1a7080 ff1a9900 000000a0 00000000 fec6b000 32ff0001 ff1a9900 0000067c
> (XEN)    00000261 000a0067 fec72000 [ff12a110] 000a0067 ff1a9900 00000000 [ff13b9ef]
> (XEN)    181cd000 ff103fb4 00000001 c7910985 ff1a9b80 fec72000 ff1a9900 [ff12a266]
> (XEN)    ff1a9900 fec72000 ff1b2000 [ff12ba14] fec71000 ff1a9b80 ff103fb4 ff1a9b90
> (XEN)    c7910985 fe31e440 00000000 00000000 00000000 00000000 ff1a9900 [ff12d55e]
> (XEN)    ff1a9900 00000000 0000000c 00200286 ff103fb4 ff1a9900 [ff13b9ef] 181cd000
> (XEN)    001446c9 00000000 c7910984 ff1a9900 00018ef0 18ef0061 [ff12eb6a] 00018ef0
> (XEN)    ffffffff 00000010 ff1a9900 00000007 c873e000 00010000 c0568ee0 000002db
> (XEN)    32db0001 ff103fb4 ff1a9b80 ff1a9900 00000000 fe3f8b6c ff1a9900 ff103fb4
> (XEN)    c7910984 ffbf3080 [ff13e867] 00000000 00000000 00000000 00000000 00000001
> (XEN)    00000005 00000020 ee000000 ffbf3080 ffbf3bf8 ffbf3080 ffbf3080 ffbf3080
> (XEN)    00007ff0 c8623284 b6e69000 [ff14a8f3] c8683eec 00000001 00000000 00007ff0
> (XEN)    c8623284 b6e69000 0000001a 000e0003 c0115b33 00000061 00200282 c8683eec
> (XEN)    00000069 0000007b 0000007b 00000000 00000000 00000000 ffbf3080
> (XEN) Xen call trace from esp=ff103e04:
> (XEN)    [<ff1269de>] [<ff12a110>] [<ff13b9ef>] [<ff12a266>] [<ff12ba14>] [<ff12d55e>]
> (XEN)    [<ff13b9ef>] [<ff12eb6a>] [<ff13e867>] [<ff14a8f3>]
> 
> ****************************************
> Panic on CPU0:
> CPU0 FATAL PAGE FAULT
> [error_code=0000]
> Faulting linear address: 00000004
> ****************************************
> 
> Reboot in five seconds...
> 
> 
> Line 910 of "grant_table.c" starts at address 0xff10b87f <gnttab_check_unmap+175> and ends at 0xff10b888 <gnttab_check_unmap+184>.
> <...>
>              ( readonly ? 1 : (!(map->ref_and_flags & GNTMAP_readonly))))
>         {
>             ref = (map->ref_and_flags >> MAPTRACK_REF_SHIFT);
>             act = &rgt->active[ref];      <- line 910
> 
>             spin_lock(&rgt->lock);
> 
>             if ( act->frame != frame )
> <...>
> 0xff10b882 <gnttab_check_unmap+178>:    mov    0x4(%ecx),%eax
> 0xff1269de <put_page_from_l1e+270>:     test   %eax,%eax
> 0xff12a110 <revalidate_l1+176>: jmp    0xff12a090 <revalidate_l1+48>
> 0xff13b9ef <__flush_tlb_mask+239>:      mov    0x44(%ebx),%eax
> 0xff12a266 <ptwr_flush+246>:    mov    %edi,(%esp)
> 0xff12d55e <do_mmuext_op+1150>: jmp    0xff12d17a <do_mmuext_op+154>
> 0xff13b9ef <__flush_tlb_mask+239>:      mov    0x44(%ebx),%eax
> 0xff12eb6a <ptwr_do_page_fault+506>:    mov    %eax,%esi
> 0xff13e867 <do_page_fault+423>: test   %eax,%eax
> 0xff14a8f3 <hypercall+83>:      mov    %eax,0x18(%esp)
>
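
A note on reading the dump above: the faulting EIP (ff10b882) is the `mov 0x4(%ecx),%eax` inside gnttab_check_unmap, and the register dump shows ecx as zero, so the reported faulting address of 00000004 is consistent with a load through a NULL base pointer plus a 4-byte offset. The sketch below only redoes that arithmetic from values copied out of the trace. The helper names (parse_registers, effective_address) are made up for illustration, and which pointer in the C source was actually NULL is not something the trace alone establishes.

```python
import re

# Values copied from the quoted console output; nothing here is measured anew.
REGISTER_DUMP = """\
(XEN) eax: 0000000a   ebx: 00000000   ecx: 00000000   edx: 00000003
(XEN) esi: 00000001   edi: ffbf2700   ebp: ffbf1004   esp: ff103e04
"""
FAULTING_INSN = "mov    0x4(%ecx),%eax"   # instruction at EIP ff10b882
FAULT_ADDRESS = 0x00000004                # "Faulting linear address"


def parse_registers(dump):
    """Return {register name: value} for the 32-bit registers in a Xen dump."""
    pairs = re.findall(r"\b(e[a-z]{2}): ([0-9a-f]{8})", dump)
    return {name: int(value, 16) for name, value in pairs}


def effective_address(insn, regs):
    """Compute base + displacement for a simple AT&T 'disp(%reg)' operand."""
    match = re.search(r"(0x[0-9a-f]+)?\(%(e[a-z]{2})\)", insn)
    if match is None:
        raise ValueError("no memory operand of the form disp(%reg) found")
    displacement = int(match.group(1), 16) if match.group(1) else 0
    return (regs[match.group(2)] + displacement) & 0xFFFFFFFF


regs = parse_registers(REGISTER_DUMP)
addr = effective_address(FAULTING_INSN, regs)
print(f"base ecx = {regs['ecx']:#010x}, effective address = {addr:#010x}")
print("matches reported fault address:", addr == FAULT_ADDRESS)
```

The same two helpers can be pointed at any other (XEN) register dump and instruction pair of this simple form; operands with an index register or scale factor are deliberately not handled.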
