xen-devel
Re: [Xen-devel] shadow2 corrupting PV guest state
Hi,
You (jeremy) said:
> I've been fighting random crashes in the paravirt tree for a while.
> After a fair amount of head-banging, it looks to me like the shadow2
> code is trashing the guest stack (and maybe register state) at random
> points.
I have a question about shadow2 from another point of view.
I've been porting the PV-on-HVM drivers to the ia64 platform. During
this work, I began to suspect that shadow2 might cause memory corruption.
I first hit the problem as a hypervisor crash while destroying an HVM
domain with an active VNIF on the ia64 platform. The crash happened
because the hypervisor detected that the P2M table was being used by
gnttab_copy during HVM domain destruction. So I looked in the x86 code
for a way to avoid the hypervisor crash.
This is what I found:
* Before shadow2, x86 and ia64 used the same logic for domain
  destruction:
  - release gnttab references
  - destroy the page tables for the VCPUs
  - destroy the P2M table for the domain
  - relinquish the memory of the domain
* With shadow2, x86 introduced delayed P2M table destruction:
  - release gnttab references
  - destroy the page tables for the VCPUs
  - relinquish the memory of the domain
  - destroy the P2M table for the domain in domain_destroy()
*** I am not confident in my investigation.
*** Am I right?
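To make the difference between the two orderings concrete, here is a minimal sketch in C. It is only a toy model of the two lists above: the enum, the two arrays, and the helper names are hypothetical and are not Xen symbols. It checks one property only, whether the P2M table is destroyed after the domain's memory is relinquished.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the teardown steps listed above (not Xen symbols). */
enum teardown_step {
    STEP_RELEASE_GNTTAB,
    STEP_DESTROY_VCPU_PAGETABLES,
    STEP_DESTROY_P2M,
    STEP_RELINQUISH_MEMORY,
    STEP_COUNT
};

/* Ordering as I understand it before shadow2. */
static const enum teardown_step order_before_shadow2[STEP_COUNT] = {
    STEP_RELEASE_GNTTAB,
    STEP_DESTROY_VCPU_PAGETABLES,
    STEP_DESTROY_P2M,
    STEP_RELINQUISH_MEMORY,
};

/* Ordering as I understand it with shadow2 (P2M teardown is delayed). */
static const enum teardown_step order_after_shadow2[STEP_COUNT] = {
    STEP_RELEASE_GNTTAB,
    STEP_DESTROY_VCPU_PAGETABLES,
    STEP_RELINQUISH_MEMORY,
    STEP_DESTROY_P2M,
};

/* Return the position of 'step' in 'order', or -1 if it is not present. */
static int step_index(const enum teardown_step *order, enum teardown_step step)
{
    assert(order != NULL);
    for (int i = 0; i < STEP_COUNT; i++) {
        if (order[i] == step)
            return i;
    }
    return -1;
}

/* True if the P2M table is still alive when memory is relinquished. */
static bool p2m_outlives_memory(const enum teardown_step *order)
{
    return step_index(order, STEP_DESTROY_P2M) >
           step_index(order, STEP_RELINQUISH_MEMORY);
}
```

Calling p2m_outlives_memory() on each array shows the difference I am asking about: it is false for the old ordering and true for the delayed one.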
Here is the code I am looking at.
[common/domain.c]
void domain_kill(struct domain *d)
{
    domain_pause(d);

    if ( test_and_set_bit(_DOMF_dying, &d->domain_flags) )
        return;

    gnttab_release_mappings(d);
    domain_relinquish_resources(d);
    put_domain(d);

    send_guest_global_virq(dom0, VIRQ_DOM_EXC);
}
[arch/x86/domain.c]
void domain_relinquish_resources(struct domain *d)
{
    struct vcpu *v;
    unsigned long pfn;
....
    /* Drop the in-use references to page-table bases. */
    for_each_vcpu ( d, v )
....
    /* Relinquish every page of memory. */
    relinquish_memory(d, &d->xenpage_list);
    relinquish_memory(d, &d->page_list);
....
This is the code for the domain_kill phase. I think the hypervisor
relinquishes the domain's memory here.
On the other hand...
[common/domain.c]
/* Release resources belonging to task @p. */
void domain_destroy(struct domain *d)
{
    struct domain **pd;
    atomic_t old, new;
....
    arch_domain_destroy(d);

    free_domain(d);

    send_guest_global_virq(dom0, VIRQ_DOM_EXC);
}
[arch/x86/domain.c]
void arch_domain_destroy(struct domain *d)
{
    shadow_final_teardown(d);
....
[arch/x86/mm/shadow/common.c]
void shadow_final_teardown(struct domain *d)
/* Called by arch_domain_destroy(), when it's safe to pull down the p2m map. */
{
....
    /* It is now safe to pull down the p2m map. */
    if ( d->arch.shadow.p2m_pages != 0 )
        shadow_p2m_teardown(d);
This code releases the P2M table.
If my reading is correct, shadow2 may cause memory corruption, because
the P2M table is destroyed only after the domain's memory has been
relinquished.
What do you think about this point?
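To illustrate the kind of hazard I have in mind, here is a small toy model in C. Everything in it (struct toy_domain and the toy_* functions) is hypothetical and much simpler than the real hypervisor. It only shows that, if the P2M table outlives the memory, a lookup in that window can return a frame the domain no longer owns. Whether any real code path can perform such a lookup is exactly what I am unsure about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_PAGES    4
#define TOY_INVALID_MFN ((unsigned long)-1)

/* Hypothetical, heavily simplified stand-in for a domain. */
struct toy_domain {
    unsigned long p2m[TOY_NR_PAGES];    /* guest pfn -> machine frame */
    bool page_owned[TOY_NR_PAGES];      /* does the domain still own the page? */
};

/* Give the domain TOY_NR_PAGES pages with an identity-like mapping. */
static void toy_domain_init(struct toy_domain *d)
{
    assert(d != NULL);
    for (size_t i = 0; i < TOY_NR_PAGES; i++) {
        d->p2m[i] = 100 + i;            /* arbitrary machine frame numbers */
        d->page_owned[i] = true;
    }
}

/* Model of relinquishing memory: the pages go back to the allocator. */
static void toy_relinquish_memory(struct toy_domain *d)
{
    for (size_t i = 0; i < TOY_NR_PAGES; i++)
        d->page_owned[i] = false;
}

/* Model of P2M teardown: every translation becomes invalid. */
static void toy_p2m_teardown(struct toy_domain *d)
{
    for (size_t i = 0; i < TOY_NR_PAGES; i++)
        d->p2m[i] = TOY_INVALID_MFN;
}

/* Translate a guest pfn through the toy P2M table. */
static unsigned long toy_gmfn_to_mfn(const struct toy_domain *d, unsigned long pfn)
{
    if (pfn >= TOY_NR_PAGES)
        return TOY_INVALID_MFN;
    return d->p2m[pfn];
}

/* True if the P2M still translates pfn to a frame the domain no longer owns. */
static bool toy_lookup_is_stale(const struct toy_domain *d, unsigned long pfn)
{
    unsigned long mfn = toy_gmfn_to_mfn(d, pfn);
    return mfn != TOY_INVALID_MFN && !d->page_owned[pfn];
}
```

In this model, calling toy_relinquish_memory() before toy_p2m_teardown() leaves a window in which toy_lookup_is_stale() returns true. With the opposite order no such window exists.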
Thanks,
- Tsunehisa Doi
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel