[Xen-devel] Re: Linux Stubdom Problem

2011/7/28 Jiageng Yu <yujiageng734@xxxxxxxxx>

2011/7/27 Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>

On Wed, 27 Jul 2011, Jiageng Yu wrote:
> 2011/7/27 Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>:
> > On Tue, 26 Jul 2011, Jiageng Yu wrote:
> >> 2011/7/26 Jiageng Yu <yujiageng734@xxxxxxxxx>:
> >> > 2011/7/22 Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>:
> >> >> On Thu, 21 Jul 2011, Jiageng Yu wrote:
> >> >>> 2011/7/19 Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>:
> >> >>> > CC'ing Tim and xen-devel
> >> >>> >
> >> >>> > On Mon, 18 Jul 2011, Jiageng Yu wrote:
> >> >>> >> 2011/7/16 Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>:
> >> >>> >> > On Fri, 15 Jul 2011, Jiageng Yu wrote:
> >> >>> >> >> 2011/7/15 Jiageng Yu <yujiageng734@xxxxxxxxx>:
> >> >>> >> >> > 2011/7/15 Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>:
> >> >>> >> >> >> On Fri, 15 Jul 2011, Jiageng Yu wrote:
> >> >>> >> >> >>> > Does it mean you are actually able to boot an HVM guest using Linux
> >> >>> >> >> >>> > based stubdoms?? Did you manage to solve the framebuffer problem too?
> >> >>> >> >> >>>
> >> >>> >> >> >>>
> >> >>> >> >> >>> The HVM guest boots, but the boot process stops because the
> >> >>> >> >> >>> VGA BIOS is not invoked by seabios. I have been stuck here for a week.
> >> >>> >> >> >>>
> >> >>> >> >> >>
> >> >>> >> >> >> There was a bug in xen-unstable.hg or seabios that would prevent the VGA BIOS
> >> >>> >> >> >> from being loaded; it should be fixed now.
> >> >>> >> >> >>
> >> >>> >> >> >> Alternatively you can temporarily work around the issue with this hacky patch:
> >> >>> >> >> >>
> >> >>> >> >> >> ---
> >> >>> >> >> >>
> >> >>> >> >> >>
> >> >>> >> >> >> diff -r 00d2c5ca26fd tools/firmware/hvmloader/hvmloader.c
> >> >>> >> >> >> --- a/tools/firmware/hvmloader/hvmloader.c Fri Jul 08 18:35:24 2011 +0100
> >> >>> >> >> >> +++ b/tools/firmware/hvmloader/hvmloader.c Fri Jul 15 11:37:12 2011 +0000
> >> >>> >> >> >> @@ -430,7 +430,7 @@ int main(void)
> >> >>> >> >> >> bios->create_pir_tables();
> >> >>> >> >> >> }
> >> >>> >> >> >>
> >> >>> >> >> >> - if ( bios->load_roms )
> >> >>> >> >> >> + if ( 1 )
> >> >>> >> >> >> {
> >> >>> >> >> >> switch ( virtual_vga )
> >> >>> >> >> >> {
> >> >>> >> >> >>
> >> >>> >> >> >>
> >> >>> >> >> >
> >> >>> >> >> > Yes, the VGA BIOS now runs. However, upstream qemu subsequently
> >> >>> >> >> > receives a SIGSEGV. I am trying to print the call stack at the
> >> >>> >> >> > point where the signal is received.
> >> >>> >> >> >
> >> >>> >> >>
> >> >>> >> >> Hi,
> >> >>> >> >>
> >> >>> >> >> I found the cause of the SIGSEGV:
> >> >>> >> >>
> >> >>> >> >> cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf, int
> >> >>> >> >> len, int is_write)
> >> >>> >> >> ->memcpy(buf, ptr + (addr & ~TARGET_PAGE_MASK), l);
> >> >>> >> >>
> >> >>> >> >> In my case ptr=0 and addr=0xc253e, so qemu accesses address
> >> >>> >> >> 0x53e (ptr + (addr & ~TARGET_PAGE_MASK)) and the SIGSEGV is raised.
> >> >>> >> >>
> >> >>> >> >> I believe qemu is trying to access the vram at this moment. This
> >> >>> >> >> code looks correct, so I will keep looking for the root cause.
> >> >>> >> >>
> >> >>> >> >
> >> >>> >> > The vram is allocated by qemu, see hw/vga.c:vga_common_init.
> >> >>> >> > qemu_ram_alloc under xen ends up calling xen_ram_alloc that calls
> >> >>> >> > xc_domain_populate_physmap_exact.
> >> >>> >> > xc_domain_populate_physmap_exact is the hypercall that should ask Xen to
> >> >>> >> > add the missing vram pages in the guest. Maybe this hypercall is failing
> >> >>> >> > in your case?
> >> >>> >>
> >> >>> >>
> >> >>> >> Hi,
> >> >>> >>
> >> >>> >> I continued to investigate this bug and found that hypercall_mmu_update
> >> >>> >> in qemu_remap_bucket (via xc_map_foreign_bulk) is failing:
> >> >>> >>
> >> >>> >> do_mmu_update
> >> >>> >> ->mod_l1_entry
> >> >>> >> -> if ( !p2m_is_ram(p2mt) || unlikely(mfn == INVALID_MFN) )
> >> >>> >> return -EINVAL;
> >> >>> >>
> >> >>> >> mfn==INVALID_MFN, because :
> >> >>> >>
> >> >>> >> mod_l1_entry
> >> >>> >> ->gfn_to_mfn(p2m_get_hostp2m(pg_dom), l1e_get_pfn(nl1e), &p2mt));
> >> >>> >> ->p2m->get_entry
> >> >>> >> ->p2m_gfn_to_mfn
> >> >>> >> -> if ( gfn > p2m->max_mapped_pfn )
> >> >>> >> /* This pfn is higher than the
> >> >>> >> highest the p2m map currently holds */
> >> >>> >> return _mfn(INVALID_MFN);
> >> >>> >>
> >> >>> >> p2m->max_mapped_pfn is usually 0xfffff. In our case,
> >> >>> >> mmu_update.val exceeds 0x8000000100000000. Since l1e =
> >> >>> >> l1e_from_intpte(mmu_update.val) and gfn = l1e_get_pfn(l1e), the gfn
> >> >>> >> exceeds 0xfffff.
> >> >>> >>
> >> >>> >> In the case of the minios-based stubdom, the mmu_update.val values do
> >> >>> >> not exceed 0x8000000100000000. Next, I will investigate why
> >> >>> >> mmu_update.val exceeds 0x8000000100000000 in the Linux case.
> >> >>> >
> >> >>> > It looks like the address of the guest that qemu is trying to map is not
> >> >>> > valid.
> >> >>> > Make sure you are running a guest with less than 2GB of ram, otherwise
> >> >>> > you need the patch series that Anthony sent on Friday:
> >> >>> >
> >> >>> > http://marc.info/?l=qemu-devel&m=131074042905711&w=2
> >> >>>
> >> >>> That is not the problem: I never allocate more than 2GB for the HVM
> >> >>> guest. The call stack in qemu is:
> >> >>>
> >> >>> qemu_get_ram_ptr
> >> >>> ->qemu_map_cache(addr, 0, 1)
> >> >>> -> if (!entry->vaddr_base || entry->paddr_index !=
> >> >>> address_index ||
> >> >>> !test_bit(address_offset >>
> >> >>> XC_PAGE_SHIFT, entry->valid_mapping)) {
> >> >>> qemu_remap_bucket(entry, size ? :
> >> >>> MCACHE_BUCKET_SIZE, address_index);
> >> >>> ->xc_map_foreign_bulk(xen_xc,
> >> >>> xen_domid, PROT_READ|PROT_WRITE,
> >> >>>
> >> >>> pfns, err, nb_pfn);
> >> >>>
> >> >>> qemu tries to map pages from the HVM guest (xen_domid) into the Linux
> >> >>> stubdom, but some of the HVM pages' pfns are larger than 0xfffff. So, in
> >> >>> p2m_gfn_to_mfn, the following condition is true (p2m->max_mapped_pfn
> >> >>> = 0xfffff):
> >> >>>
> >> >>> if ( gfn > p2m->max_mapped_pfn )
> >> >>> /* This pfn is higher than the highest the p2m map currently holds */
> >> >>> return _mfn(INVALID_MFN);
> >> >>>
> >> >>> In the minios stubdom case, the HVM pages' pfns do not exceed 0xfffff.
> >> >>> Maybe the address translation in the Linux stubdom causes this problem?
> >> >>
> >> >> Trying to map a pfn > 0xfffff is clearly a mistake if the guest's memory
> >> >> does not exceed 2G:
> >> >>
> >> >> 0xfffff * 4096 > 2G
> >> >>
> >> >>
> >> >>> BTW, in the minios stubdom case, there seems to be no hvmloader
> >> >>> process. Is it needed in the Linux stubdom?
> >> >>
> >> >> hvmloader is the first thing that runs within the guest; it is not a
> >> >> process in the stubdom or in dom0.
> >> >> It is required in both minios and linux stubdoms.
> >> >
> >> > Hi Stefano,
> >> >
> >> > I applied these patches, but we still have the same problem.
> >> > However, I noticed that qemu_get_ram_ptr(s->vram_offset) in the
> >> > vga_common_init function also failed. Maybe this explains the
> >> > previous problem, which happened while trying to remap
> >> > 0xc0000-0xc8fff of the HVM guest into the stubdom.
> >> >
> >> > I have traced the process of qemu_get_ram_ptr(s->vram_offset) and
> >> > located the failure in p2m_gfn_to_mfn function:
> >> >
> >> > pod_retry_l3:
> >> > if ( (l3e_get_flags(*l3e) & _PAGE_PRESENT) == 0 )
> >> > {
> >> > .....
> >> > return _mfn(INVALID_MFN);
> >> > }
> >> >
> >> > I will continue to analyze this failure.
> >> >
> >> > Thanks!
> >> >
> >> > Jiageng Yu.
> >> >
> >>
> >>
> >> Hi,
> >>
> >> I compared the two executions of the vga_common_init function, in dom0
> >> and in the Linux-based stubdom. The former succeeded and the latter
> >> failed. First, they have the same call stack:
> >>
> >> Dom0 & Stubdom
> >> _________________________________________________________
> >> vga_common_init
> >> ->qemu_get_ram_ptr(s->vram_offset)
> >> ->block->host = xen_map_block(block->offset, block->length);
> >> ->xc_map_foreign_bulk()
> >> ->linux_privcmd_map_foreign_bulk()
> >> ->xen_remap_domain_mfn_range()
> >> ->HYPERVISOR_mmu_update()
> >> __________________________________________________________
> >>
> >> Xen
> >> __________________________________________________________
> >> do_mmu_update()
> >> ->case MMU_PT_UPDATE_PRESERVE_AD:
> >> ->case PGT_l1_page_table:
> >> ->mod_l1_entry(va, l1e, mfn,cmd == MMU_PT_UPDATE_PRESERVE_AD, v, pg_owner);
> >> ->mfn_x(gfn_to_mfn(p2m_get_hostp2m(pg_dom),
> >> l1e_get_pfn(nl1e), &p2mt));
> >> ->gfn_to_mfn_type_p2m()
> >> ->p2m->get_entry(p2m, gfn, t, &a, q);
> >> ->p2m_gfn_to_mfn(p2m,gfn,t,&a,q)
> >> ->if ( (l3e_get_flags(*l3e) &
> >> _PAGE_PRESENT) == 0 )
> >> -> Error happens!
> >>
> >> The qemu in dom0 can find the l3e of the HVM guest, but the qemu in the
> >> Linux stubdom cannot. In my case, s->vram_offset=0x40000000 and
> >> vga_ram_size=0x800000, so we are trying to map the HVM guest's address
> >> range (pfn: 0x40000, size: 8M) into the Linux stubdom's address space.
> >>
> >> In the p2m_gfn_to_mfn function, p2m->domain->domain_id is the HVM guest,
> >> gfn=0x40000 and t=p2m_mmio_dm.
> >> mfn = pagetable_get_mfn(p2m_get_pagetable(p2m)) = 0x10746e;
> >> map_domain_page(mfn_x(mfn)) also succeeds. However, after executing:
> >> l3e += ( (0x40000 << PAGE_SHIFT) >> L3_PAGETABLE_SHIFT)
> >> l3e->l3 is 0, and the error occurs.
> >>
> >> So, in the Linux stubdom, when we try to map this range of the HVM guest
> >> (pfn: 0x40000, size: 8M), we find that these guest pages are not
> >> present. This never happens with qemu in dom0. Could you give me some
> >> hints about this problem?
> >
> >
> > It seems that you are trying to map pages that don't exist.
> > The pages in question should be allocated by:
> >
> > qemu_ram_alloc(NULL, "vga.vram", vga_ram_size)
> > qemu_ram_alloc_from_ptr
> > xen_ram_alloc
> > xc_domain_populate_physmap_exact
> >
> > so I would add some printf and printk calls on this code path to find
> > out whether xc_domain_populate_physmap_exact fails for some reason.
>
> Hmm.. the Linux stubdom kernel had a wrong p2m pair
> <gfn(0x40000),mfn(0x127bd2)> for some reason. Afterwards,
> xc_domain_populate_physmap_exact sets up the correct p2m pair
> <gfn(0x40000),mfn(0x896b7)>. However, the p2m pair in the stubdom kernel
> is apparently not updated, because the following access to 0x40000 still
> uses 0x127bd2.

There is only one p2m for the guest domain, and it lives in Xen, so I
cannot understand how it is possible that you get the old mfn value.
Also there shouldn't even be an old value because before
xc_domain_populate_physmap_exact pfn 0x40000 wasn't even allocated in
the guest yet.

Make sure you are using the right domid in both calls
(xc_domain_populate_physmap_exact and xc_map_foreign_bulk). Also make
sure that libxenlight calls xc_domain_set_target and xs_set_target for
the stubdom; otherwise the stubdom is not going to be privileged enough
to allocate and map memory of the guest.

> I noticed you have a patch, "xen: modify kernel mappings corresponding
> to granted pages". I think it might solve my problem.

That patch fixes a different issue, related to grant table mappings.

OK. That is my fault.

The root cause of the previous problem is that the backend drivers in qemu were not stopped. To confirm this, I removed the stubdom-specific code from the xen_be_init function of the old qemu, and the same problem appeared. The following patch fixes the issue in upstream qemu.

diff --git a/xen-all.c b/xen-all.c
index b73fc43..8f0645e 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -472,12 +479,23 @@ static void cpu_handle_ioreq(void *opaque)
static void xenstore_record_dm_state(XenIOState *s, const char *state)
{
     char path[50];
+#ifdef CONFIG_STUBDOM
+    s->xenstore = xs_daemon_open();
+    if (s->xenstore == NULL) {
+        perror("xen: xenstore open");
+        exit(1);
+    }
+#endif
     snprintf(path, sizeof (path), "/local/domain/0/device-model/%u/state", xen_domid);
     if (!xs_write(s->xenstore, XBT_NULL, path, state, strlen(state))) {
         fprintf(stderr, "error recording dm state\n");
         exit(1);
     }
+#ifdef CONFIG_STUBDOM
+    xs_daemon_close(s->xenstore);
+    s->xenstore = NULL;
+#endif
}

static void xen_main_loop_prepare(XenIOState *state)
@@ -538,6 +556,7 @@ int xen_hvm_init(void)

     state = qemu_mallocz(sizeof (XenIOState));

+#ifndef CONFIG_STUBDOM
     state->xce_handle = xen_xc_evtchn_open(NULL, 0);
     if (state->xce_handle == XC_HANDLER_INITIAL_VALUE) {
         perror("xen: event channel open");
@@ -549,6 +568,10 @@ int xen_hvm_init(void)
         perror("xen: xenstore open");
         return -errno;
     }
+#else
+    state->xce_handle = XC_HANDLER_INITIAL_VALUE;
+    state->xenstore = NULL;
+#endif

     state->exit.notify = xen_exit_notifier;
     qemu_add_exit_notifier(&state->exit);
@@ -575,9 +598,10 @@ int xen_hvm_init(void)

     state->ioreq_local_port = qemu_mallocz(smp_cpus * sizeof (evtchn_port_t));

     /* FIXME: how about if we overflow the page here? */
     for (i = 0; i < smp_cpus; i++) {
-        rc = xc_evtchn_bind_interdomain(state->xce_handle, xen_domid,
+        rc = xc_evtchn_bind_interdomain(xen_xc, xen_domid,
                                         xen_vcpu_eport(state->shared_page, i));
         if (rc == -1) {
             fprintf(stderr, "bind interdomain ioctl error %d\n", errno);

The new problem is that my stubdom hangs at:

hvmloader:
->main()
->pci_setup()
->pci_writeb(PCI_ISA_DEVFN, 0x60 + link, isa_irq);

I am investigating this problem. The pci_writeb should eventually reach hvm_set_pci_link_route in Xen through the following path:

hvmloader: pci_writeb(PCI_ISA_DEVFN, 0x60 + link, isa_irq);
qemu(stubdom): PCIHostState->data_handler->write()
qemu(stubdom): i440fx_write_config_xen()
qemu(stubdom): xen_piix_pci_write_config_client()
xenctrl: xc_hvm_set_pci_link_route()

The I/O port is registered by the pci_host_data_register_ioport(0xcfc, s) call.

I will find out why i440fx_write_config_xen() is not invoked in my case. I will also read the pciutils.patch of the minios stubdom; it may contain something relevant.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
