Nakajima, Jun wrote:
>> And why would we need to take interrupts between loading esp0 and
>> LDT?
>>
>>     load_esp0(t, thread);
>>
>> +   local_irq_enable();
>> +
>>     load_LDT(&init_mm.context);
>
> I thought it was required to get IPIs working (for load_LDT and the
> other on-going TLB-flush activities), but it looks bogus after
> sleeping on it. I'm pretty sure that while it resolves the hang, it's
> hiding an underlying bug.
>
I've finally root-caused it. It's much deeper than I expected...
Here is what's happening:
void arch_do_createdomain(struct vcpu *v)
{
    ...
    l1_pgentry_t gdt_l1e;
    ...
    d->arch.mm_perdomain_pt = alloc_xenheap_page();
    memset(d->arch.mm_perdomain_pt, 0, PAGE_SIZE);
    ...
    for ( vcpuid = 0; vcpuid < MAX_VIRT_CPUS; vcpuid++ )
        d->arch.mm_perdomain_pt[
            (vcpuid << PDPT_VCPU_SHIFT) + FIRST_RESERVED_GDT_PAGE] =
            gdt_l1e;
The maximum value of (vcpuid << PDPT_VCPU_SHIFT) + FIRST_RESERVED_GDT_PAGE
is 1006 (< 1024), but each entry is 8 bytes for PAE (and x86_64), so a
single page holds only 512 entries and alloc_xenheap_page() is not
sufficient: the writes past the end corrupt the next page, which contains
the vcpu_info areas holding evtchn_upcall_pending for the vcpus. On my
machine that affected vcpu 7 (and 23). At load_LDT we check for pending
events in hypercall_preempt_check(); the bit is already (spuriously) set
for vcpu 7, and it is never cleared after hypercall4_create_continuation()
because nobody actually posted such an event... so it was looping there.
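
For concreteness, here is the arithmetic as a standalone check. This is a
minimal sketch: PDPT_VCPU_SHIFT = 5 and FIRST_RESERVED_GDT_PAGE = 14 are
inferred from the 1006 figure together with MAX_VIRT_CPUS = 32, so treat
the constants as assumptions rather than the actual header values.

    #include <stdio.h>

    /* Assumed values: (31 << 5) + 14 = 1006 matches the figure above. */
    #define MAX_VIRT_CPUS            32
    #define PDPT_VCPU_SHIFT          5
    #define FIRST_RESERVED_GDT_PAGE  14
    #define PAGE_SIZE                4096
    #define PTE_SIZE                 8   /* l1_pgentry_t under PAE/x86_64 */

    int main(void)
    {
        unsigned int max_index =
            ((MAX_VIRT_CPUS - 1) << PDPT_VCPU_SHIFT) + FIRST_RESERVED_GDT_PAGE;
        unsigned int bytes = (max_index + 1) * PTE_SIZE;

        /* One 4K page holds only 512 eight-byte entries; index 1006 is
         * far past that, so the loop scribbles into the next page. */
        printf("max index    : %u\n", max_index);            /* 1006 */
        printf("bytes needed : %u\n", bytes);                /* 8056 */
        printf("pages needed : %u\n",
               (bytes + PAGE_SIZE - 1) / PAGE_SIZE);         /* 2 */
        return 0;
    }

The loop it spins in is the preemption path of do_mmuext_op():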
int do_mmuext_op(
    struct mmuext_op *uops,
    ...
{
    ...
    for ( i = 0; i < count; i++ )
    {
        if ( hypercall_preempt_check() )
        {
            rc = hypercall4_create_continuation(
                __HYPERVISOR_mmuext_op, uops,
                (count - i) | MMU_UPDATE_PREEMPTED, pdone, foreigndom);
            break;
        }
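
In other words, once the overrun has set the pending flag, every pass
preempts at i == 0 and re-queues the full count, so the hypercall never
makes progress. A toy model of that livelock (the names are illustrative
stand-ins, not Xen's actual definitions):

    #include <stdbool.h>
    #include <stdio.h>

    static bool evtchn_upcall_pending = true;  /* spuriously set by the overrun */

    /* Illustrative stand-in for Xen's check; the real macro also looks
     * at softirqs, but the stuck event bit is what matters here. */
    static bool hypercall_preempt_check(void) { return evtchn_upcall_pending; }

    /* Model of the preemptable loop: return how many ops remain queued. */
    static unsigned int mmuext_op_model(unsigned int count)
    {
        for ( unsigned int i = 0; i < count; i++ )
        {
            if ( hypercall_preempt_check() )
                return count - i;    /* continuation re-queues the rest */
            /* ... perform op i ... */
        }
        return 0;
    }

    int main(void)
    {
        unsigned int remaining = 8;
        /* The flag is never cleared (no event was really posted), so
         * the remainder never shrinks: 8, 8, 8, ... forever. */
        for ( int round = 0; round < 3 && remaining > 0; round++ )
        {
            remaining = mmuext_op_model(remaining);
            printf("round %d: %u ops still queued\n", round, remaining);
        }
        return 0;
    }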
Signed-off-by: Jun Nakajima <jun.nakajima@xxxxxxxxx>
----
diff -r 9c7aeec94f8a xen/arch/x86/domain.c
--- a/xen/arch/x86/domain.c     Tue Nov 15 19:46:48 2005 +0100
+++ b/xen/arch/x86/domain.c     Wed Nov 16 23:23:44 2005 -0700
@@ -252,6 +252,8 @@
     struct domain *d = v->domain;
     l1_pgentry_t gdt_l1e;
     int vcpuid;
+    physaddr_t size;
+    int order;
 
     if ( is_idle_task(d) )
         return;
@@ -265,9 +267,11 @@
     SHARE_PFN_WITH_DOMAIN(virt_to_page(d->shared_info), d);
     set_pfn_from_mfn(virt_to_phys(d->shared_info) >> PAGE_SHIFT,
                      INVALID_M2P_ENTRY);
-
-    d->arch.mm_perdomain_pt = alloc_xenheap_page();
-    memset(d->arch.mm_perdomain_pt, 0, PAGE_SIZE);
+    size = ((((MAX_VIRT_CPUS - 1) << PDPT_VCPU_SHIFT)
+             + FIRST_RESERVED_GDT_PAGE) * sizeof (l1_pgentry_t));
+    order = get_order_from_bytes(size);
+    d->arch.mm_perdomain_pt = alloc_xenheap_pages(order);
+    memset(d->arch.mm_perdomain_pt, 0, PAGE_SIZE << order);
     set_pfn_from_mfn(virt_to_phys(d->arch.mm_perdomain_pt) >> PAGE_SHIFT,
                      INVALID_M2P_ENTRY);
     v->arch.perdomain_ptes = d->arch.mm_perdomain_pt;
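
And a quick sanity check of the new sizing. Again a sketch:
get_order_from_bytes is re-implemented here under the assumption that it
returns the smallest order such that PAGE_SIZE << order covers the
requested bytes, and the constants are the ones inferred above.

    #include <stdio.h>

    #define PAGE_SIZE  4096

    /* Assumed semantics: smallest order with (PAGE_SIZE << order) >= bytes. */
    static int get_order_from_bytes(unsigned long bytes)
    {
        int order = 0;
        while (((unsigned long)PAGE_SIZE << order) < bytes)
            order++;
        return order;
    }

    int main(void)
    {
        /* Same expression as the patch, with the inferred constants. */
        unsigned long size = (((32 - 1) << 5) + 14) * 8UL;    /* 8048 bytes */
        int order = get_order_from_bytes(size);

        printf("size  = %lu bytes\n", size);                  /* 8048 */
        printf("order = %d -> %lu bytes allocated\n",
               order, (unsigned long)PAGE_SIZE << order);     /* 1 -> 8192 */
        return 0;
    }

Order 1 (two pages) covers entries up to index 1023, so index 1006 no
longer overruns into the neighbouring vcpu_info page.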
Jun
---
Intel Open Source Technology Center