On 05/31/2010 02:42 AM, Rafal Wojtczuk wrote:
> Hello,
>
>> I would be grateful for comments on possible methods to improve domain
>> restore performance, focusing on the PV case, if it matters.
>>
> Continuing the topic; thank you to everyone who responded so far.
>
> Focusing on the xen-3.4.3 case for now; dom0/domU are still 2.6.32.x pvops x86_64.
> Let me just reiterate that for our purposes the domain save time (and any
> related post-processing) is not critical; it is only the restore time that
> matters. I did some experiments; they involve:
> 1) before saving a domain, have domU allocate all free memory in a userland
> process and fill it with some MAGIC_PATTERN. Save domU, then post-process the
> savefile, removing all pfns (and their page content) that refer to a page
> containing MAGIC_PATTERN. This reduces the savefile size.
>
Why not just balloon the domain down?
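Ballooning the guest down before the save means those free pages never make it
into the savefile in the first place, so there is nothing to strip afterwards.
It boils down to writing the guest's memory target (in KiB) into xenstore,
which is essentially what "xm mem-set" does; a minimal sketch via libxenstore,
with the domid and target value as placeholders:

    #include <stdio.h>
    #include <string.h>
    #include <xs.h>

    /* Sketch only: ask the guest's balloon driver to shrink the domain by
     * writing the memory target (in KiB) under its xenstore tree. */
    static int set_balloon_target(int domid, unsigned long target_kib)
    {
        char path[64], val[32];
        struct xs_handle *xsh = xs_daemon_open();

        if (!xsh)
            return -1;

        snprintf(path, sizeof(path), "/local/domain/%d/memory/target", domid);
        snprintf(val, sizeof(val), "%lu", target_kib);

        if (!xs_write(xsh, XBT_NULL, path, val, strlen(val))) {
            xs_daemon_close(xsh);
            return -1;
        }

        xs_daemon_close(xsh);
        return 0;
    }

The guest then needs a moment to actually release the pages before the save is
started.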
> 2) instead of executing "xm restore savefile", just poke the xmlrpc request
> to the Xend unix socket via socat
>
I would seek alternatives to the xend/xm toolset. I've been doing my
bit to make libxenlight/xl useful, though it still needs a lot of work
to get it to anything remotely production-ready...
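If you want a quick data point without rewriting anything, timing the same
restore through xl is easy, e.g.:

    xl restore savefile

The libxc side (xc_domain_restore) should be the same code either way, so any
difference you see is toolstack and xenstore overhead around it.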
> 3) change /etc/xen/scripts/block so that in the "add file:" case it calls
> only three processes (xenstore-read, losetup, xenstore-write); assuming the
> sharing check can be done elsewhere, this should provide a realistic lower
> bound for the execution time
>
> For a domain with 400 MB of RAM and 4 vbds, with the savefile in the fs cache,
> this cuts down the restore real time from 2700 ms to 1153 ms. Some questions:
> a) Is method 1) safe? Normally, xc_domain_restore() allocates mfns via
> xc_domain_memory_populate_physmap() and then calls
> xc_add_mmu_update(MMU_MACHPHYS_UPDATE) on the pfn/mfn pairs. If we remove
> some pfns from the savefile, this will not happen. Instead, the mfn for a
> removed pfn (referring to memory whose content we don't care about) will be
> allocated in uncanonicalize_pagetable(), because there will be a pte entry
> for this page. But uncanonicalize_pagetable() does not call
> xc_add_mmu_update(). Still, the domain seems to be restored properly
> (naturally the buffer previously filled with MAGIC_PATTERN now contains
> junk, but that is the whole point).
> Again, is xc_add_mmu_update(MMU_MACHPHYS_UPDATE) really needed in the above
> scenario? It basically does
> set_gpfn_from_mfn(mfn, gpfn)
> but shouldn't that already be taken care of by
> xc_domain_memory_populate_physmap()?
>
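For reference, the M2P update in question is a single MMU_MACHPHYS_UPDATE per
pfn/mfn pair; roughly the pattern below, simplified from memory of the
xc_domain_restore() code (these are libxc-internal helpers from xc_private.h,
and the exact names may differ between versions):

    /* Sketch only: queue one machine-to-phys update for a restored page.
     * "mmu" is the batching context libxc sets up earlier in the restore;
     * the accumulated batch is flushed by the matching finish/flush helper. */
    static int queue_m2p_update(int xc_handle, struct xc_mmu *mmu,
                                unsigned long pfn, unsigned long mfn)
    {
        /* Equivalent to set_gpfn_from_mfn(mfn, pfn) inside the hypervisor. */
        return xc_add_mmu_update(xc_handle, mmu,
                                 ((unsigned long long)mfn << 12 /* PAGE_SHIFT */)
                                 | MMU_MACHPHYS_UPDATE,
                                 pfn);
    }

As the comment says, all it does is keep the M2P array in sync for that mfn.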
> b) There still seems to be some discrepancy between the real time (1153 ms)
> and the CPU time (970 ms); considering this is a machine with 2 cores (and
> at least the hotplug scripts execute in parallel), it is notable. What can
> cause the involved processes to sleep? (We read the savefile from the fs
> cache, so there should be no disk reads at all.) Is the single-threaded
> nature of xenstored a possible cause of the delays?
>
Have you tried oxenstored? It works well for me, and seems to be a lot
faster.
> Generally, xenstored seems to be quite busy during the restore. Do you think
> some of the queries (from Xend?) are redundant? Is there anything else that
> can be removed from the relevant Xend code with no harm? This question may
> sound too blunt, but given that "xm restore savefile" wastes 220 ms of CPU
> time doing apparently nothing useful, I would assume there is some overhead
> in Xend too.
> The systemtap trace is in the attachment; it does not contain a line about
> the xenstored CPU ticks (259 ms, really a lot?), because xenstored does not
> terminate any thread.
>
> c)
>
>>> Also, it looks really excessive that basically copying 400 MB of memory
>>> takes over 1.3 s of CPU time. Is IOCTL_PRIVCMD_MMAPBATCH the culprit (its
>>>
>> I would expect IOCTL_PRIVCMD_MMAPBATCH to be the most significant part of
>> that loop.
>>
> Let's imagine there is a hypercall do_direct_memcpy_from_dom0_to_mfn(int
> mfn_count, mfn* mfn_array, char *pages_content).
> Would it make xc_restore faster if, instead of using the
> xc_map_foreign_batch() interface, it called the above hypercall? On x86_64
> all the physical memory is already mapped in the hypervisor (is this
> correct?), so could this be quicker, as no page table setup would be
> necessary?
>
The main cost of pagetable manipulations is the tlb flush; if you can
batch all your setups together to amortize the cost of the tlb flush, it
should be pretty quick. But if batching is not being used properly,
then it could get very expensive. My own observation of "strace xl
restore" is that it seems to do a *lot* of ioctls on privcmd, but I
haven't looked more closely to see what those calls are, and whether
they're being done in an optimal way.
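For what it's worth, the per-batch copy in xc_domain_restore() looks roughly
like the sketch below (simplified, with error handling and the real batch
bookkeeping omitted), so as long as each call covers a reasonably large batch,
the mapping/TLB cost is amortized over many pages:

    #include <string.h>
    #include <sys/mman.h>
    #include <xenctrl.h>

    /* Rough sketch, not the exact libxc code: map one batch of foreign
     * frames (one mmap + IOCTL_PRIVCMD_MMAPBATCH under the hood), copy the
     * page data from the savefile buffer, and unmap again. */
    static int copy_batch(int xc_handle, uint32_t dom,
                          xen_pfn_t *mfns, const char *buf, int count)
    {
        char *region = xc_map_foreign_batch(xc_handle, dom,
                                            PROT_READ | PROT_WRITE,
                                            mfns, count);
        if (!region)
            return -1;

        memcpy(region, buf, (size_t)count * 4096 /* PAGE_SIZE */);

        munmap(region, (size_t)count * 4096);
        return 0;
    }

If strace shows many more privcmd ioctls than batches, that would point at the
batching not working as intended.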
J
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel