> -----Original Message-----
> From: Pradeep Singh, TLS-Chennai [mailto:pradeep_s@xxxxxx]
> Sent: 12 March 2007 12:39
> To: Petersson, Mats; xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: RE: [Xen-devel] paging mechanism clarification
>
>
>
> -----Original Message-----
> From: Petersson, Mats [mailto:Mats.Petersson@xxxxxxx]
> Sent: Mon 12-Mar-07 5:35 PM
> To: Pradeep Singh, TLS-Chennai; xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: RE: [Xen-devel] paging mechanism clarification
>
>
>
> > -----Original Message-----
> > From: Pradeep Singh, TLS-Chennai [mailto:pradeep_s@xxxxxx]
> > Sent: 12 March 2007 11:34
> > To: Petersson, Mats; xen-devel@xxxxxxxxxxxxxxxxxxx
> > Subject: RE: [Xen-devel] paging mechanism clarification
> >
> >
> >
> > -----Original Message-----
> > From: Petersson, Mats [mailto:Mats.Petersson@xxxxxxx]
> > Sent: Mon 12-Mar-07 4:13 PM
> > To: Pradeep Singh, TLS-Chennai; xen-devel@xxxxxxxxxxxxxxxxxxx
> > Subject: RE: [Xen-devel] paging mechanism clarification
> >
> >
> >
> > > -----Original Message-----
> > > From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
> > > [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of
> > > Pradeep Singh, TLS-Chennai
> > > Sent: 12 March 2007 05:56
> > > To: xen-devel@xxxxxxxxxxxxxxxxxxx
> > > Subject: [Xen-devel] paging mechanism clarification
> > >
> > >
> > > Hi All,
> > >
> > > Xen uses a two-level paging mechanism to resolve virtual
> > > addresses from a domU into frame numbers. The first level is
> > > handled by the MMU for the domU, i.e. translation from
> > > virtual address to physical address, just like a normal
> > > paging mechanism. The second level of translation is done by
> > > the Xen hypervisor. It takes the pseudo-physical address
> > > received from the domU, treats it as a normal virtual
> > > address, and finds the page frame using the regular paging mechanism.
> >
> > No, not in the current model.
> >
> > Does that include xen-3.0.3 as well?
> > I assume it holds for the whole Xen 3 series?
>
> Yes, 3.0.3 is very similar to (if not exactly the same as) 3.0.5
> (currently called "unstable").
> >
> > Paging in HVM (fully virtualized) domains is managed by shadow
> > paging, which, simplified, works like this:
> > When paging is disabled in the guest, we still enable paging in the
> > processor and give the processor a CR3 that points to a map of
> > where the guest memory is.
> >
> > Mats, when paging is disabled in the guest (which I guess is the
> > case while the Xen domU is booting), how is this related to
> > paging on the processor?
> > Even if the guest is booting or has paging disabled, the
> > hypervisor should be independent of this non-paging guest
> > instance. How does paging in the hypervisor depend on
> > paging in the guest?
>
> When the system boots, the processor is normally in "real-mode", and
> it definitely doesn't have paging enabled. So we have to "make the
> guest OS believe this is the case". But at the same time, the guest OS
> is most likely not loaded at address zero in memory, so we need paging
> enabled to remap the GUEST PHYSICAL address to match the machine
> physical address. So we have a "linear map" to translate "address
> zero" to the "start of guest memory", and so on for every page of
> memory in the guest.
>
> This is not hard to do, since the AMD-V/VT feature of the processor
> expects the paging-bit to be different between what the guest "thinks"
> and the actual case. In AMD-V, there's even support for running
> real-mode with paging enabled, so all the BIOS-code and such will be
> running in this mode. VT has to do a bunch of tricky stuff to work
> around that problem.
>
> Ok, fine. Does this argument hold true even for non-VT and
> non-Pacifica enabled processors?
> I doubt it.
Not precisely. I'm talking only about HVM mode, which is "full
virtualization". PV-mode uses a different paging interface, which, at
least for the most part, consists of changing the whole area of code in the
kernel that updates the page-tables, by adding code that is aware of the
THREE types of address (guest-virtual, guest-physical and
machine-physical). This means that there's no real need for the
"read-only page-tables" and "shadow-mode" - the page-table just contains
the right value for the machine-physical address. [That's not to say
that read-only page-tables can't be used in a PV system too - I'm not
100% sure how the page-table management works in the PV mode].
>
> >
> > I hope I made myself clear.
> > Please enlighten me :-).
> >
> > When paging is enabled, we use a shadow page-table, which
> > essentially means that the GUEST sees one page-table, and the
> > processor another (thanks to the fact that the hypervisor intercepts
> > the CR3 read/write operations, and when CR3 is read back by the
> > guest, we don't send back the value it's ACTUALLY POINTING TO IN THE
> > PROCESSOR, but the value that was set by the guest). So there are
> > two page-tables.
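To make the CR3 interception concrete, here is a toy model in Python. It is only a sketch of the idea, not Xen code: the class, the method names and the made-up shadow root addresses are all assumptions for illustration.

```python
class VirtualCpu:
    """Toy model of CR3 interception (illustrative only, not Xen code)."""

    def __init__(self):
        self.guest_cr3 = 0          # the value the guest believes is loaded
        self.hardware_cr3 = 0       # the value the real processor uses
        self._shadow_roots = {}     # guest root -> shadow root
        self._next_shadow_root = 0x70000000  # hypothetical pool of shadow pages

    def _shadow_root_for(self, guest_root):
        # Allocate (or reuse) a shadow page-table root for this guest root.
        if guest_root not in self._shadow_roots:
            self._shadow_roots[guest_root] = self._next_shadow_root
            self._next_shadow_root += 0x1000
        return self._shadow_roots[guest_root]

    def guest_write_cr3(self, value):
        # Intercepted write: remember the guest's value, but point the
        # processor at the shadow page-table instead.
        self.guest_cr3 = value
        self.hardware_cr3 = self._shadow_root_for(value)

    def guest_read_cr3(self):
        # Intercepted read: hand back what the guest wrote, never the
        # value the processor is actually using.
        return self.guest_cr3


vcpu = VirtualCpu()
vcpu.guest_write_cr3(0x1000)
print(hex(vcpu.guest_read_cr3()), hex(vcpu.hardware_cr3))
```

The guest only ever sees its own value, while the processor walks the shadow table.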
> >
> > Got this well, thanks Mats :).
> >
> > To make the page-table updates by the guest visible to the
> > hypervisor, all of the guest-page-tables are made read-only (by
> > scanning the new CR3 value whenever one is set).
> >
> > I didn't quite get this either :(
> > Sorry, but do you mean CR3 for the guest or for the
> > processor? I assume you mean the guest?
>
> Yes, scan the guest-CR3 to see where it placed the page-tables.
>
> >
> > Whenever a page-fault happens, the hypervisor has "first look", and
> > determines if the update is for a page-table or not. If it is a
> > page-table update, the guest operation is emulated (in
> > x86_emulate.c), and the result is written to the shadow-page-table
> > AND the
> >
> > Why do we need emulation? Is there some particular reason for it?
> > Do you mean that if I am running a 32-bit domU on top of a
> > 64-bit processor, the guest operation for updating the page
> > table is emulated by the hypervisor? Am I right?
>
> No, it's simply because we need to see the result of the instruction
> and write it to two places (with some modification in one of those
> places). So if the code is doing, for example: "*pte |= 1;" (set a
> page-table-entry to "present"), we need to mark both the
> guest-page-table-entry "present", and mark our shadow-entry "present"
> (and perhaps do some other work too, but that's the minimum work
> needed).
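A rough sketch of that "write it to two places" step, in Python rather than the real C in x86_emulate.c. The function name, the dictionary-based tables and the gfn_to_mfn map are assumptions made up for the illustration; the real code does a good deal more.

```python
PAGE_SHIFT = 12
PRESENT = 0x1
FLAG_MASK = (1 << PAGE_SHIFT) - 1   # low bits of an entry hold the flags


def emulate_pte_write(guest_pt, shadow_pt, index, new_value, gfn_to_mfn):
    """Apply a guest page-table write to both tables (toy model)."""
    # 1. The guest's own table gets exactly what the guest wrote.
    guest_pt[index] = new_value

    # 2. The shadow table gets the same flags, but with the guest frame
    #    number replaced by the machine frame number.
    if new_value & PRESENT:
        gfn = new_value >> PAGE_SHIFT
        mfn = gfn_to_mfn[gfn]
        shadow_pt[index] = (mfn << PAGE_SHIFT) | (new_value & FLAG_MASK)
    else:
        shadow_pt[index] = 0        # not present: nothing for the CPU to map


# Guest frame 0x100 lives in machine frame 0x10100 (made-up numbers).
gfn_to_mfn = {0x100: 0x10100}
guest_pt = {5: 0x100000}            # entry exists but is not yet "present"
shadow_pt = {5: 0}

# The guest executes the equivalent of "*pte |= 1;" on entry 5.
emulate_pte_write(guest_pt, shadow_pt, 5, guest_pt[5] | PRESENT, gfn_to_mfn)
print(hex(guest_pt[5]), hex(shadow_pt[5]))
```

Both entries end up "present", but only the shadow entry carries the machine address.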
>
> This brings one more question to my mind. Why do we use pinning then?
I believe there are two types of pinning! Page-pinning, which is blocking
a page from being accessed in an incorrect way [again, I'm not 100% sure
how this works, or exactly what it does - just that it's a term used in
the general way I described in the previous sentence].
> As I see it, it is to avoid shadow page tables being swapped out
> before the page tables they actually point to are swapped. Am I right?
>
> But according to the interface manual, pinning is used to bind a vcpu
> to a specific CPU in an SMP environment. These two statements look
> pretty orthogonal to me, which means I may be wrong :(.
> Can somebody help me in this regard?
CPU pinning is to tie a VCPU to a (set of) processor(s). For example,
you may want to pin Dom0 to run only on CPU0, and pin a DomU to run on
CPUs 1, 2 and 3. That way, Dom0 is ALWAYS able to run on its own CPU,
and it's never in contention about which CPU to use, and DomU can run on
three CPUs as much as it likes. You could have another DomU pinned to
CPU 3 if you wish. That means that CPUs 1 and 2 are exclusively for the
first DomU, whilst the second DomU shares CPU3 with the first DomU (so
they both get half the CPU performance of one CPU - on average over a
reasonable amount of time).
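The pinning example can be written down as a small table, sketched here in Python. The domain names and the cpus_to_domains helper are made up for the illustration; this only models which domains are allowed on which CPU, not how the scheduler actually divides time.

```python
def cpus_to_domains(affinity):
    """Invert a domain -> allowed-CPUs map into CPU -> domains (toy model)."""
    sharing = {}
    for domain, cpus in affinity.items():
        for cpu in cpus:
            sharing.setdefault(cpu, []).append(domain)
    return {cpu: sorted(domains) for cpu, domains in sorted(sharing.items())}


# The example above: Dom0 on CPU0, one DomU on CPUs 1-3, another on CPU3.
affinity = {
    "Dom0": {0},
    "DomU-1": {1, 2, 3},
    "DomU-2": {3},
}

for cpu, domains in cpus_to_domains(affinity).items():
    state = "exclusive" if len(domains) == 1 else "shared"
    print("CPU%d: %s (%s)" % (cpu, ", ".join(domains), state))
```

CPU0, CPU1 and CPU2 each have a single domain allowed on them, while CPU3 is contended by the two DomUs.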
--
Mats
>
> Pointers to actual code will be of great help.
>
> Thanks a lot Mats.
> Thank you all.
>
> --pradeep
> >
> > Does this mean that on an x86 platform this overkill, or this
> > emulation, is skipped altogether?
> > Please bear with me, as I am an absolute Xen newbie here :-).
>
> No, it's ALWAYS used for all page-table writes, as far as I
> understand.
>
> --
> Mats
> >
> > guest-page-table, but in the shadow-page-table, the value is
> > modified to reflect the actual address in machine-space, rather than
> > what the guest thinks it should be.
> >
> > In future versions of AMD processors (and I believe Intel are
> > working on something very similar if not the same), there will be a
> > mode where the processor is able to work in "nested paging mode",
> > which means that there are two "parallel" page-tables. The first one
> > is the "guest-page-table", the second one is the "host-page-table".
> > In this case, every lookup in the guest-page-table will be done
> > through the host-page-table. So we have a "simple" way to just take
> > the guest-page-table and translate it to machine-physical-address -
> > with the good thing that the host-page-table needn't change, since
> > the pages that the host consists of are pretty much static for the
> > duration of the guest.
> >
> > Yes, I read about this in an article that mentioned how Pacifica
> > is better than VT.
> >
> > Say for example, we have a guest that lives at 256-512MB. The
> > guest-page-table would contain, for example, a mapping for
> > 0x12200000 -> guest-physical 0x100000 (1MB). The host-page-table
> > translates this to 0x10100000 because the 1MB entry in guest-address
> > is 256+1MB in machine-address.
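The arithmetic in that example can be checked with a few lines of Python. This is only a sketch: it assumes the guest's memory is one contiguous chunk starting at 256MB and uses flat, single-level tables, whereas real nested paging walks multi-level tables and also sends the guest-page-table accesses themselves through the host-page-table. All the names are made up for the illustration.

```python
PAGE_SIZE = 4096
GUEST_BASE = 256 * 1024 * 1024      # guest memory starts at 256MB (0x10000000)


def guest_translate(guest_pt, vaddr):
    """Stage 1: guest-virtual -> guest-physical, via the guest-page-table."""
    page, offset = divmod(vaddr, PAGE_SIZE)
    return guest_pt[page] * PAGE_SIZE + offset


def host_translate(gpaddr):
    """Stage 2: guest-physical -> machine-physical (assumes contiguous guest)."""
    return GUEST_BASE + gpaddr


def nested_translate(guest_pt, vaddr):
    """Both stages chained, as the hardware would do on each lookup."""
    return host_translate(guest_translate(guest_pt, vaddr))


# Guest-virtual 0x12200000 maps to guest-physical 0x100000 (1MB).
guest_pt = {0x12200000 // PAGE_SIZE: 0x100000 // PAGE_SIZE}

print(hex(nested_translate(guest_pt, 0x12200000)))   # 0x10100000
```

Note that only guest_pt changes as the guest runs; host_translate stays fixed for the lifetime of the guest.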
> >
> > Exactly, got this well on spot :).
> >
> > [In reality, it's very likely that the guest never gets all the
> > space in one big chunk, but rather a few pages here and a few pages
> > there. If there are big chunks, we could use large pages to map
> > those!].
> >
> > Thanks a ton Mats and all.
> >
> > --pradeep
> >
> > The support for nested paging (called HAP, Hardware Assisted
> > Paging) has been in the Unstable version of Xen since a few days
> > back.
> >
> > --
> > Mats
> > >
> > > And this whole two-level paging constitutes Xen's shadow page
> > > tables. Right?
> > >
> > > Is my understanding of Xen's paging mechanism correct, or am I
> > > missing something?
> > >
> > > Thank you
> > >
> > > -pradeep
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel