WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
 
   
 

xen-devel

[Xen-devel] Re: Linux Stubdom Problem

To: Jiageng Yu <yujiageng734@xxxxxxxxx>
Subject: [Xen-devel] Re: Linux Stubdom Problem
From: Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>
Date: Fri, 29 Jul 2011 15:29:02 +0100
Cc: Ian Campbell <Ian.Campbell@xxxxxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxx>, "Xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Samuel Thibault <samuel.thibault@xxxxxxxxxxxx>, Stefano Stabellini <Stefano.Stabellini@xxxxxxxxxxxxx>
Delivery-date: Fri, 29 Jul 2011 07:23:49 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <CAJ0pt175GrngJTxnvx0GRf1RvmXe_JAMxBch+SprKOArNx42ng@xxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <CAJ0pt14RBmT+bCGhU6szMW4Aje-mBQ-WVR8Vb7wOLefgauatbA@xxxxxxxxxxxxxx> <alpine.DEB.2.00.1107211815370.12963@kaball-desktop> <CAJ0pt15gNOhDb5K+oaqN2w8NxQ5HmtVp6YBYY+yZY6gpOST5+Q@xxxxxxxxxxxxxx> <CAJ0pt16xxQTqE==JGSZ-W=e4zwHV41Cs-wXE5V=x7SG5Aak+3g@xxxxxxxxxxxxxx> <alpine.DEB.2.00.1107271224350.12963@kaball-desktop> <CAJ0pt15tkb8F6LNHxSwjVmCF9DvvJjZqQKU-TXKyqT_seZibmw@xxxxxxxxxxxxxx> <alpine.DEB.2.00.1107271431070.12963@kaball-desktop> <CAJ0pt15-X7psh5Fzxzo0=8BR9G-hdVjdPqQO7CYLDCgNx9zNZg@xxxxxxxxxxxxxx> <CAJ0pt175GrngJTxnvx0GRf1RvmXe_JAMxBch+SprKOArNx42ng@xxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Alpine 2.00 (DEB 1167 2008-08-23)
On Thu, 28 Jul 2011, Jiageng Yu wrote:
> 2011/7/27 Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>
> On Wed, 27 Jul 2011, Jiageng Yu wrote:
> > 2011/7/27 Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>:
> > > On Tue, 26 Jul 2011, Jiageng Yu wrote:
> > >> 2011/7/26 Jiageng Yu <yujiageng734@xxxxxxxxx>:
> > >> > 2011/7/22 Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>:
> > >> >> On Thu, 21 Jul 2011, Jiageng Yu wrote:
> > >> >>> 2011/7/19 Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>:
> > >> >>> > CC'ing Tim and xen-devel
> > >> >>> >
> > >> >>> > On Mon, 18 Jul 2011, Jiageng Yu wrote:
> > >> >>> >> 2011/7/16 Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>:
> > >> >>> >> > On Fri, 15 Jul 2011, Jiageng Yu wrote:
> > >> >>> >> >> 2011/7/15 Jiageng Yu <yujiageng734@xxxxxxxxx>:
> > >> >>> >> >> > 2011/7/15 Stefano Stabellini 
> > >> >>> >> >> > <stefano.stabellini@xxxxxxxxxxxxx>:
> > >> >>> >> >> >> On Fri, 15 Jul 2011, Jiageng Yu wrote:
> > >> >>> >> >> >>> > Does it mean you are actually able to boot an HVM guest 
> > >> >>> >> >> >>> > using Linux
> > >> >>> >> >> >>> > based stubdoms?? Did you manage to solve the framebuffer 
> > >> >>> >> >> >>> > problem too?
> > >> >>> >> >> >>>
> > >> >>> >> >> >>>
> > >> >>> >> >> >>> The HVM guest is booted. But the boot process is 
> > >> >>> >> >> >>> terminated because
> > >> >>> >> >> >>> vga bios is not invoked by seabios. I have got stuck here 
> > >> >>> >> >> >>> for a week.
> > >> >>> >> >> >>>
> > >> >>> >> >> >>
> > >> >>> >> >> >> There was a bug in xen-unstable.hg or seabios that would 
> > >> >>> >> >> >> prevent vga bios from
> > >> >>> >> >> >> being loaded, it should be fixed now.
> > >> >>> >> >> >>
> > >> >>> >> >> >> Alternatively you can temporarily work around the issue 
> > >> >>> >> >> >> with this hacky patch:
> > >> >>> >> >> >>
> > >> >>> >> >> >> ---
> > >> >>> >> >> >>
> > >> >>> >> >> >>
> > >> >>> >> >> >> diff -r 00d2c5ca26fd tools/firmware/hvmloader/hvmloader.c
> > >> >>> >> >> >> --- a/tools/firmware/hvmloader/hvmloader.c      Fri Jul 08 18:35:24 2011 +0100
> > >> >>> >> >> >> +++ b/tools/firmware/hvmloader/hvmloader.c      Fri Jul 15 11:37:12 2011 +0000
> > >> >>> >> >> >> @@ -430,7 +430,7 @@ int main(void)
> > >> >>> >> >> >>             bios->create_pir_tables();
> > >> >>> >> >> >>     }
> > >> >>> >> >> >>
> > >> >>> >> >> >> -    if ( bios->load_roms )
> > >> >>> >> >> >> +    if ( 1 )
> > >> >>> >> >> >>     {
> > >> >>> >> >> >>         switch ( virtual_vga )
> > >> >>> >> >> >>         {
> > >> >>> >> >> >>
> > >> >>> >> >> >>
> > >> >>> >> >> >
> > >> >>> >> >> > Yes. Vga bios is booted. However, the upstream qemu receives 
> > >> >>> >> >> > a SIGSEGV
> > >> >>> >> >> > signal subsequently. I am trying to print the call stack when
> > >> >>> >> >> > receiving the signal.
> > >> >>> >> >> >
> > >> >>> >> >>
> > >> >>> >> >> Hi,
> > >> >>> >> >>
> > >> >>> >> >>    I found the cause of the SIGSEGV signal:
> > >> >>> >> >>
> > >> >>> >> >>    cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
> > >> >>> >> >>                           int len, int is_write)
> > >> >>> >> >>                   ->memcpy(buf, ptr + (addr & ~TARGET_PAGE_MASK), l);
> > >> >>> >> >>
> > >> >>> >> >>     In my case, ptr=0 and addr=0xc253e. When qemu attempts to
> > >> >>> >> >> access address 0x53e, the SIGSEGV signal is generated.
> > >> >>> >> >>
> > >> >>> >> >>     I believe qemu is trying to access the vram at this point.
> > >> >>> >> >> This code looks correct, so I will keep looking for the root
> > >> >>> >> >> cause.
> > >> >>> >> >>
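The offset arithmetic in the trace above can be checked in isolation. This is a minimal sketch, not QEMU code: it assumes 4 KiB target pages (a TARGET_PAGE_BITS of 12), and `page_offset` and `copy_source` are hypothetical helper names introduced only for illustration.

```c
#include <stdint.h>

/* Assumption: 4 KiB target pages, as on x86. */
#define TARGET_PAGE_BITS 12
#define TARGET_PAGE_SIZE (UINT64_C(1) << TARGET_PAGE_BITS)
#define TARGET_PAGE_MASK (~(TARGET_PAGE_SIZE - 1))

/* Hypothetical helper: offset of a guest physical address within its page. */
static uint64_t page_offset(uint64_t addr)
{
    return addr & ~TARGET_PAGE_MASK;
}

/* Hypothetical helper: the address memcpy would read from, given the
 * host pointer returned for the page (0 when the mapping failed). */
static uintptr_t copy_source(uintptr_t ptr, uint64_t addr)
{
    return ptr + (uintptr_t)page_offset(addr);
}
```

For addr = 0xc253e the in-page offset is 0x53e, so a zero ptr makes memcpy read from address 0x53e, which matches the faulting address reported above.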
> > >> >>> >> >
> > >> >>> >> > The vram is allocated by qemu, see hw/vga.c:vga_common_init.
> > >> >>> >> > qemu_ram_alloc under xen ends up calling xen_ram_alloc that 
> > >> >>> >> > calls
> > >> >>> >> > xc_domain_populate_physmap_exact.
> > >> >>> >> > xc_domain_populate_physmap_exact is the hypercall that should 
> > >> >>> >> > ask Xen to
> > >> >>> >> > add the missing vram pages in the guest. Maybe this hypercall 
> > >> >>> >> > is failing
> > >> >>> >> > in your case?
> > >> >>> >>
> > >> >>> >>
> > >> >>> >> Hi,
> > >> >>> >>
> > >> >>> >>    I continued to investigate this bug and found that
> > >> >>> >> hypercall_mmu_update in qemu_remap_bucket(xc_map_foreign_bulk) is
> > >> >>> >> failing:
> > >> >>> >>
> > >> >>> >> do_mmu_update
> > >> >>> >>       ->mod_l1_entry
> > >> >>> >>              ->  if ( !p2m_is_ram(p2mt) || unlikely(mfn == 
> > >> >>> >> INVALID_MFN) )
> > >> >>> >>                          return -EINVAL;
> > >> >>> >>
> > >> >>> >>    mfn==INVALID_MFN, because :
> > >> >>> >>
> > >> >>> >> mod_l1_entry
> > >> >>> >>       ->gfn_to_mfn(p2m_get_hostp2m(pg_dom), l1e_get_pfn(nl1e), 
> > >> >>> >> &p2mt));
> > >> >>> >>               ->p2m->get_entry
> > >> >>> >>                         ->p2m_gfn_to_mfn
> > >> >>> >>                                -> if ( gfn > p2m->max_mapped_pfn )
> > >> >>> >>                                    /* This pfn is higher than the
> > >> >>> >> highest the p2m map currently holds */
> > >> >>> >>                                    return _mfn(INVALID_MFN);
> > >> >>> >>
> > >> >>> >>    The p2m->max_mapped_pfn is usually 0xfffff. In our case,
> > >> >>> >> mmu_update.val exceeds 0x8000000100000000.  Additionally, l1e =
> > >> >>> >> l1e_from_intpte(mmu_update.val); gfn=l1e_get_pfn(l1e ). 
> > >> >>> >> Therefore, gfn
> > >> >>> >> will exceed 0xfffff.
> > >> >>> >>
> > >> >>> >>    In the case of the minios based stubdom, the mmu_update.vals do not
> > >> >>> >> exceed 0x8000000100000000. Next, I will investigate why
> > >> >>> >> mmu_update.val exceeds 0x8000000100000000.
> > >> >>> >
> > >> >>> > It looks like the address of the guest that qemu is trying to map 
> > >> >>> > is not
> > >> >>> > valid.
> > >> >>> > Make sure you are running a guest with less than 2GB of ram, 
> > >> >>> > otherwise
> > >> >>> > you need the patch series that Anthony sent on Friday:
> > >> >>> >
> > >> >>> > http://marc.info/?l=qemu-devel&m=131074042905711&w=2
> > >> >>>
> > >> >>> Not this problem. I never alloc more than 2GB for the hvm guest. The
> > >> >>> call stack in qemu is:
> > >> >>>
> > >> >>> qemu_get_ram_ptr
> > >> >>>       ->qemu_map_cache(addr, 0, 1)
> > >> >>>                  -> if (!entry->vaddr_base || entry->paddr_index !=
> > >> >>> address_index ||
> > >> >>>                                           !test_bit(address_offset >>
> > >> >>> XC_PAGE_SHIFT, entry->valid_mapping)) {
> > >> >>>                            qemu_remap_bucket(entry, size ? :
> > >> >>> MCACHE_BUCKET_SIZE, address_index);
> > >> >>>                                  ->xc_map_foreign_bulk(xen_xc,
> > >> >>> xen_domid, PROT_READ|PROT_WRITE,
> > >> >>>
> > >> >>>                 pfns, err, nb_pfn);
> > >> >>>
> > >> >>> qemu tries to map pages from the hvm guest (xen_domid) into the linux
> > >> >>> stubdom. But some hvm pages' pfns are larger than 0xfffff. So, in
> > >> >>> p2m_gfn_to_mfn, the following condition is true
> > >> >>> (p2m->max_mapped_pfn = 0xfffff):
> > >> >>>
> > >> >>>     if ( gfn > p2m->max_mapped_pfn )
> > >> >>>         /* This pfn is higher than the highest the p2m map currently 
> > >> >>> holds */
> > >> >>>         return _mfn(INVALID_MFN);
> > >> >>>
> > >> >>>  In the minios stubdom case, the hvm pages' pfns do not exceed 0xfffff.
> > >> >>> Maybe the address translation in the linux stubdom causes this problem?
> > >> >>
> > >> >> Trying to map a pfn > 0xfffff is clearly a mistake if the guest's 
> > >> >> memory
> > >> >> does not exceed 2G:
> > >> >>
> > >> >> 0xfffff * 4096 > 2G
> > >> >>
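The bound above is easy to verify with a few lines of C. This is only a sketch under simplifying assumptions: 4 KiB pages, and guest RAM treated as one contiguous range starting at address 0 (holes in the real guest memory layout are ignored). Both helper names are hypothetical.

```c
#include <stdint.h>

#define PAGE_SHIFT 12  /* assumption: 4 KiB pages */

/* Hypothetical helper: first byte address covered by a page frame number. */
static uint64_t pfn_to_addr(uint64_t pfn)
{
    return pfn << PAGE_SHIFT;
}

/* Hypothetical helper: highest pfn of a guest whose RAM is a single
 * contiguous range of ram_bytes starting at address 0. */
static uint64_t max_pfn_for_ram(uint64_t ram_bytes)
{
    return (ram_bytes >> PAGE_SHIFT) - 1;
}
```

With these definitions, pfn 0xfffff starts at 0xfffff000, just below 4 GiB, while a 2 GiB guest ends at pfn 0x7ffff, so any pfn above 0xfffff is far outside such a guest.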
> > >> >>
> > >> >>>  BTW, in the minios stubdom case, there seems to be no hvmloader
> > >> >>> process. Is it needed in the linux stubdom?
> > >> >>
> > >> >> hvmloader is the first thing that runs within the guest, it is not a
> > >> >> process in the stubdom or in dom0.
> > >> >> It is required in both minios and linux stubdoms.
> > >> >
> > >> > Hi Stefano,
> > >> >
> > >> >      I applied these patches, but we still have the same problem.
> > >> > However, I noticed that qemu_get_ram_ptr(s->vram_offset) in the
> > >> > vga_common_init function also failed. Maybe this explains the
> > >> > previous problem, which happened while trying to remap
> > >> > 0xc0000-0xc8fff of the hvm guest into the stubdom.
> > >> >
> > >> >     I have traced the process of qemu_get_ram_ptr(s->vram_offset) and
> > >> > located the failure in p2m_gfn_to_mfn function:
> > >> >
> > >> >     pod_retry_l3:
> > >> >        if ( (l3e_get_flags(*l3e) & _PAGE_PRESENT) == 0 )
> > >> >        {
> > >> >                 .....
> > >> >                 return _mfn(INVALID_MFN);
> > >> >        }
> > >> >
> > >> >     I will continue to analyze this failure.
> > >> >
> > >> >     Thanks!
> > >> >
> > >> > Jiageng Yu.
> > >> >
> > >>
> > >>
> > >> Hi,
> > >>
> > >>     I compared the two executions of the vga_common_init function in dom0
> > >> and in the linux based stubdom. The former succeeded and the latter
> > >> failed. First, they have the same call stack:
> > >>
> > >> Dom0 & Stubdom
> > >> _________________________________________________________
> > >> vga_common_init
> > >>      ->qemu_get_ram_ptr(s->vram_offset)
> > >>            ->block->host = xen_map_block(block->offset, block->length);
> > >>                  ->xc_map_foreign_bulk()
> > >>                         ->linux_privcmd_map_foreign_bulk()
> > >>                                ->xen_remap_domain_mfn_range()
> > >>                                      ->HYPERVISOR_mmu_update()
> > >> __________________________________________________________
> > >>
> > >> Xen
> > >> __________________________________________________________
> > >> do_mmu_update()
> > >>    ->case MMU_PT_UPDATE_PRESERVE_AD:
> > >>    ->case PGT_l1_page_table:
> > >>    ->mod_l1_entry(va, l1e, mfn,cmd == MMU_PT_UPDATE_PRESERVE_AD, v, 
> > >> pg_owner);
> > >>           ->mfn_x(gfn_to_mfn(p2m_get_hostp2m(pg_dom),
> > >> l1e_get_pfn(nl1e), &p2mt));
> > >>                  ->gfn_to_mfn_type_p2m()
> > >>                         ->p2m->get_entry(p2m, gfn, t, &a, q);
> > >>                                ->p2m_gfn_to_mfn(p2m,gfn,t,&a,q)
> > >>                                       ->if ( (l3e_get_flags(*l3e) &
> > >> _PAGE_PRESENT) == 0 )
> > >>                                       ->    Error happens!
> > >>
> > >> The qemu in dom0 can find the l3e of the hvm guest, but the qemu in the
> > >> linux stubdom cannot. In my case, s->vram_offset=0x40000000 and
> > >> vga_ram_size=0x800000. Therefore, we are going to map the hvm guest's
> > >> address area (pfn:0x40000, size:8M) into the linux stubdom's address space.
> > >>
> > >> In the p2m_gfn_to_mfn function, p2m->domain->domain_id=hvm guest,
> > >> gfn=0x40000, t=p2m_mmio_dm.
> > >> mfn = pagetable_get_mfn(p2m_get_pagetable(p2m)) = 0x10746e;
> > >> map_domain_page(mfn_x(mfn)) also succeeds. However, after executing:
> > >> l3e += ( (0x40000 << PAGE_SHIFT) >> L3_PAGETABLE_SHIFT)
> > >> l3e->l3 = 0, and the error happens.
> > >>
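For reference, the index computed by the expression quoted above can be worked out by hand. The sketch below is not Xen code: it assumes a PAGE_SHIFT of 12 and an L3_PAGETABLE_SHIFT of 30 (each L3 entry covering 1 GiB), and the helper names are hypothetical.

```c
#include <stdint.h>

#define PAGE_SHIFT          12  /* assumption: 4 KiB pages */
#define L3_PAGETABLE_SHIFT  30  /* assumption: each L3 entry covers 1 GiB */

/* Hypothetical helper: which L3 slot a guest frame number falls into.
 * 64-bit arithmetic avoids overflow when shifting the gfn left. */
static unsigned int l3_index_for_gfn(uint64_t gfn)
{
    return (unsigned int)((gfn << PAGE_SHIFT) >> L3_PAGETABLE_SHIFT);
}

/* Hypothetical helper: number of 4 KiB pages in a byte count. */
static uint64_t bytes_to_pages(uint64_t bytes)
{
    return bytes >> PAGE_SHIFT;
}
```

Under these assumptions gfn 0x40000 corresponds to guest physical address 0x40000000 (1 GiB), so it lands in L3 slot 1, and the 8M vram spans 0x800 pages that all stay within that same slot.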
> > >> So, in the linux stubdom, when we try to map the specified hvm
> > >> guest address range (pfn:0x40000, size:8M), we find that these pages of
> > >> the hvm guest are not present. This never happens with qemu in dom0.
> > >> Could you give me some hints about this problem?
> > >
> > >
> > > It seems that you are trying to map pages that don't exist.
> > > The pages in question should be allocated by:
> > >
> > > qemu_ram_alloc(NULL, "vga.vram", vga_ram_size)
> > >    qemu_ram_alloc_from_ptr
> > >        xen_ram_alloc
> > >            xc_domain_populate_physmap_exact
> > >
> > > so I would add some printf and printk on this code path to find out if
> > > xc_domain_populate_physmap_exact fails for some reason.
> >
> > Hmm.. the linux stubdom kernel had a wrong p2m pair
> > <gfn(0x40000),mfn(0x127bd2)> for some reason. Next,
> > xc_domain_populate_physmap_exact sets up the correct p2m pair
> > <gfn(0x40000),mfn(0x896b7)>. However, the p2m pair in the stubdom kernel
> > is not updated, because the following access to 0x40000 still
> > uses 0x127bd2.
> 
> There is only one p2m for the guest domain, and it lives in Xen, so I cannot
> understand how it is possible that you get the old mfn value.
> Also there shouldn't even be an old value because before
> xc_domain_populate_physmap_exact pfn 0x40000 wasn't even allocated in
> the guest yet.
> 
> Make sure you are using the right domid in both calls
> (xc_domain_populate_physmap_exact and xc_map_foreign_bulk), also make
> sure that libxenlight calls xc_domain_set_target and xs_set_target for
> the stubdom otherwise the stubdom is not going to be privileged enough
> to allocate and map memory of the guest.
> 
> 
> > I notice you have a patch: xen: modify kernel mappings corresponding
> > to granted pages. I think maybe it could solve my problem.
> 
> That patch fixes a different issue, related to grant table mappings.
> 
> 
>  
> OK. That is my fault.
>  
> The root cause of the previous problem is that the backend drivers in qemu
> are not stopped. To confirm this, I removed the stubdom-related code from the
> xen_be_init function of old qemu, and the same problem appeared. The
> following patch fixes this issue in upstream qemu.
>  
> diff --git a/xen-all.c b/xen-all.c
> index b73fc43..8f0645e 100644
> --- a/xen-all.c
> +++ b/xen-all.c
> @@ -472,12 +479,23 @@ static void cpu_handle_ioreq(void *opaque)
>  static void xenstore_record_dm_state(XenIOState *s, const char *state)
>  {
>      char path[50];
> +#ifdef CONFIG_STUBDOM
> +    s->xenstore = xs_daemon_open();
> +    if (s->xenstore == NULL) {
> +        perror("xen: xenstore open");
> +        return -errno;
> +    }
> +#endif
>      snprintf(path, sizeof (path), "/local/domain/0/device-model/%u/state", xen_domid);
>      if (!xs_write(s->xenstore, XBT_NULL, path, state, strlen(state))) {
>          fprintf(stderr, "error recording dm state\n");
>          exit(1);
>      }
> +#ifdef CONFIG_STUBDOM
> +    xs_daemon_close(s->xenstore);
> +    s->xenstore = NULL;
> +#endif
>  }

Why do you need to re-open the xenstore connection here?
It should already have been opened by xen_hvm_init, as in the normal
case.


>  static void xen_main_loop_prepare(XenIOState *state)
> @@ -538,6 +556,7 @@ int xen_hvm_init(void)
>  
>      state = qemu_mallocz(sizeof (XenIOState));
>  
> +#ifndef CONFIG_STUBDOM
>      state->xce_handle = xen_xc_evtchn_open(NULL, 0);
>      if (state->xce_handle == XC_HANDLER_INITIAL_VALUE) {
>          perror("xen: event channel open");
> @@ -549,6 +568,10 @@ int xen_hvm_init(void)
>          perror("xen: xenstore open");
>          return -errno;
>      }
> +#else
> +    state->xce_handle = XC_HANDLER_INITIAL_VALUE;
> +    state->xenstore = NULL;
> +#endif
>  

So you are explicitly avoiding opening the xenstore connection in
xen_hvm_init. Why?
I think you might be trying to fix a race condition: maybe something is
not ready yet at this point and becomes ready later?


>      state->exit.notify = xen_exit_notifier;
>      qemu_add_exit_notifier(&state->exit);
> @@ -575,9 +598,10 @@ int xen_hvm_init(void)
>  
>      state->ioreq_local_port = qemu_mallocz(smp_cpus * sizeof (evtchn_port_t));
>  
>      /* FIXME: how about if we overflow the page here? */
>      for (i = 0; i < smp_cpus; i++) {
> -        rc = xc_evtchn_bind_interdomain(state->xce_handle, xen_domid,
> +       rc = xc_evtchn_bind_interdomain(xen_xc, xen_domid,
>                                          xen_vcpu_eport(state->shared_page, i));
>          if (rc == -1) {
>              fprintf(stderr, "bind interdomain ioctl error %d\n", errno);
> 

This cannot be right: xc_evtchn_bind_interdomain takes an xc_evtchn* as
its first parameter, while xen_xc is an xc_interface*.
This change would prevent you from receiving any IO request
notifications from Xen.
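
To make the objection concrete, here is a self-contained sketch built entirely from mock types. None of these names are libxc APIs. It assumes, purely for illustration, that each handle records which kind of device it was opened on and that a bind request issued on the wrong kind of handle fails instead of returning a local port.

```c
#include <stddef.h>

/* Mock handle kinds for illustration only; these are not libxc types. */
enum mock_kind { MOCK_PRIVCMD, MOCK_EVTCHN };

struct mock_handle {
    enum mock_kind kind;
    int next_port;   /* next local port this handle would hand out */
};

/* Mock of an interdomain bind: only an event channel handle can return
 * a local port. Any other handle yields -1, so the caller never gets a
 * port to wait on and never sees a notification. */
static int mock_bind_interdomain(struct mock_handle *h, int domid,
                                 int remote_port)
{
    (void)domid;        /* unused in this mock */
    (void)remote_port;  /* unused in this mock */
    if (h == NULL || h->kind != MOCK_EVTCHN)
        return -1;
    return h->next_port++;
}
```

In this mock, binding through a MOCK_PRIVCMD handle always returns -1, while binding through a MOCK_EVTCHN handle returns successive local ports. The real failure mode may differ, but the effect on the device model is the same: no usable local port, hence no IO request notifications.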

 
> The new problem is my stubdom hangs at:
>  
> hvmloader:
>      ->main()
>             ->pci_setup()
>                     ->pci_writeb(PCI_ISA_DEVFN, 0x60 + link, isa_irq);
>  
> I am investigating this problem. The pci_writeb will finally call the 
> hvm_set_pci_link_route in Xen:
>  
> hvmloader:             pci_writeb(PCI_ISA_DEVFN, 0x60 + link, isa_irq);
> qemu(stubdom):     PCIHostState->data_handler->write()
> qemu(stubdom):     i440fx_write_config_xen()
> qemu(stubdom):     xen_piix_pci_write_config_client()
> xenctrl:                   xc_hvm_set_pci_link_route()
>  
> The ioport is registered by pci_host_data_register_ioport(0xcfc, s) function.
>  
> I will find out why i440fx_write_config_xen() is not invoked in my case. I
> will also read the pciutils.patch of the minios stubdom and maybe find
> something interesting.
 
I think you are not receiving any IO request notifications from Xen
because of the previous change.
It is probably worth adding a printf into xen-all.c:handle_ioreq to see
if you receive something.
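
Something along these lines would do as a debug print. It is only a sketch: `struct mock_ioreq` is a cut-down stand-in for the real request structure, `format_ioreq` is a hypothetical helper, and the meaning of `dir` (1 for read, 0 for write) is an assumption to check against the headers you build with.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Mock of the fields worth logging; the real request has more. */
struct mock_ioreq {
    uint64_t addr;   /* port or physical address being accessed */
    uint64_t data;   /* value written, or buffer for the value read */
    uint32_t size;   /* access size in bytes */
    uint8_t  type;   /* request type code */
    uint8_t  dir;    /* assumption: 1 = read, 0 = write */
};

/* Hypothetical helper: format one request into buf and return the
 * snprintf result, so it can be sent to stderr or a log file. */
static int format_ioreq(char *buf, size_t len, const struct mock_ioreq *req)
{
    return snprintf(buf, len,
                    "ioreq: type=%u dir=%s addr=0x%" PRIx64
                    " size=%u data=0x%" PRIx64,
                    (unsigned)req->type, req->dir ? "read" : "write",
                    req->addr, (unsigned)req->size, req->data);
}
```

If nothing like this is ever printed while hvmloader is stuck in pci_setup(), the requests are not reaching qemu at all.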