Matt:
	Yes, as you mentioned, letting domU or a VT-i domain do only page flipping
works under the assumption that the service domain owns all system pages (i.e.
every other domain's pages come from the service domain). But that assumption
eventually breaks for driver domains, since only one domain can own all system
pages. So we can either start with what you proposed and later (say, in Xen 3.1)
rework it to match what x86 does today, or we can align with Xen/x86 from the
start and save all of that maintenance and rework effort. I suggest we go with
the design that will be the right one in the end.
	Yes, supporting a PMT may require modifications to Xen/ia64 Linux, but as
you pointed out, domU has to maintain a PMT table anyway (for migration, memory
relocation, etc.), so why not let dom0 work the same way? Having dom0 and domU
share as much code as possible is the right approach, IMO.
	The modification to Xen/ia64 Linux is not that big, probably only the PMT
setup for now; the VBD/VNIF work can then build on and modify it. It should be
almost the same as the x86 approach.
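	To make this concrete, here is a minimal sketch of what a per-domain PMT
(guest-physical to machine translation) could look like. This is illustrative
only; the names and layout below are made up for the example and are not the
actual Xen/ia64 code:

    /* Illustrative only: a per-domain gpfn -> mfn table. */
    #define INVALID_MFN (~0UL)

    struct pmt_table {
        unsigned long *gpfn_to_mfn;  /* indexed by guest physical frame number */
        unsigned long  max_gpfn;     /* number of entries */
    };

    /* Look up the machine frame backing a guest physical frame. */
    static unsigned long pmt_lookup(const struct pmt_table *pmt,
                                    unsigned long gpfn)
    {
        return (gpfn < pmt->max_gpfn) ? pmt->gpfn_to_mfn[gpfn] : INVALID_MFN;
    }

    /* Page flipping (e.g. a VNIF receive) then reduces to retargeting
     * one entry in each domain's table. */
    static void pmt_set_entry(struct pmt_table *pmt,
                              unsigned long gpfn, unsigned long mfn)
    {
        if (gpfn < pmt->max_gpfn)
            pmt->gpfn_to_mfn[gpfn] = mfn;
    }
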
	What specific questions do you have about x86 shadow_translate? I can
consult the experts here too if you need. :-)
	So, it may be time for us to dig into the details of how to do the
PMT... :-) And Dan, what do you think?
Eddie
Matt Chapman wrote:
> I'm still not clear about the details. Could you outline the changes
> that you want to make to Xen/ia64?
>
> Would DomU have a PMT? Surely DomU should not know about real machine
> addresses, that should be hidden behind the grant table interface.
> Otherwise migration, save/restore, etc. are difficult (as they have
> found on x86).
>
> Do you know how x86 shadow_translate mode works? Perhaps we should
> use that as an example.
>
> Matt
>
>
> On Mon, Oct 31, 2005 at 05:11:09PM +0800, Tian, Kevin wrote:
>> Matt Chapman wrote:
>>> 1. Packet arrives in a Dom0 SKB. Of course the buffer needs
>>> to be page sized/aligned (this is true on x86 too).
>>> 2. netback steals the buffer
>>> 3. netback donates it to DomU *without freeing it*
>>> 4. DomU receives the frame and passes it up its network stack
>>> 5. DomU gives away other frame(s) to restore balance
>>> 6. Dom0 eventually receives extra frames via its balloon driver
>>>
>>> 5 and 6 can be done lazily in batches. Alternatively, 4 and 5
>>> could be a single "flip" operation.
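>>>
>>> As a rough sketch of steps 2-6 (a toy model only; "owner" stands in for
>>> the frame_table ownership that Xen tracks, and none of these names are
>>> real netback code):
>>>
>>>     enum owner { DOM0, DOMU };
>>>
>>>     struct frame {
>>>         enum owner owner;
>>>         int        in_use;   /* still referenced by a driver or stack */
>>>     };
>>>
>>>     /* Steps 2-3: Dom0 steals the received buffer and donates it to
>>>      * DomU without freeing it. */
>>>     static void flip_to_domu(struct frame *f)
>>>     {
>>>         f->owner = DOMU;   /* ownership changes...                    */
>>>         /* ...but in_use stays set: the data goes up DomU's stack.    */
>>>     }
>>>
>>>     /* Steps 5-6: DomU gives back some other frame, and Dom0's balloon
>>>      * driver eventually absorbs it to restore the page balance. */
>>>     static void return_to_dom0(struct frame *f)
>>>     {
>>>         f->owner = DOM0;
>>>         f->in_use = 0;
>>>     }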
>>
>> The solution will work with some tweaks. But is there any obvious
>> benefit over the PMT approach used on x86? (If yes, you should suggest
>> it to xen-devel ;-) Usually we only want a different approach when it
>> is either "can't be done on this architecture" or "far better
>> performance than the existing one". Otherwise, why diverge from the
>> Xen design and take on extra maintenance effort? That extra effort has
>> already cost us 2+ weeks getting VBD up to support DomU over the last
>> 2 upstream merges.
>>
>>>
>>> I think this is not significantly different from x86.
>>>
>>> I'm not saying this is necessarily better than a PMT solution,
>>> but I want to discuss the differences and trade-offs. By PMT
>>> I assume you mean to make Dom0 not 1:1 mapped, and then give
>>> it access to the translation table? Can you describe how the
>>> above works differently with a PMT?
>>
>>
>> In terms of work flow, the PMT approach is similar: the backend and
>> frontend need to touch the PMT table for the ownership change.
>> But have you evaluated how many tricky changes are required to support
>> Domain0 with gpn=mfn on top of the existing code? For example:
>> - Backend drivers are not bound to dom0; a domU can also act as a
>> driver domain, and in that case a 1:1 mapping makes no sense.
>> There has already been some talk of domU servers handling driver I/O.
>> - You need to ensure all available pages are granted to dom0, which
>> means changing the current dom0 allocation code.
>> - You need to change the current vnif code with an unknown number of
>> #ifdefs and workarounds, since you would be implementing new behavior
>> on top of a different approach.
>> - ... (maintenance!)
>>
>> So if you were implementing a VM from scratch, your approach would
>> definitely be worth trying, since there would be no such constraints.
>> But since we are working on Xen, we should take advantage of the
>> current Xen design as much as possible, right? ;-)
>>
>>>
>>> One disadvantage I see of having Dom0 not 1:1 is that superpages
>>> are more difficult; we can't just use the guest's superpages.
>>
>>
>> Superpages are an optimization, and we still need to support
>> non-contiguous pages as a basic requirement. You can still add an
>> option to allocate contiguous pages for a guest even with a PMT
>> table, since para-virtualization is cooperative.
>>
>>>
>>> Also, are there paravirtualisation changes needed to support a
>>> PMT? I'm concerned about not making the paravirtualisation
>>> changes too complex (I think x86 Xen changes the OS too much).
>>> Also, it should be possible to load Xen frontend drivers into
>>> unmodified OSs (on VT).
>>
>>
>> We need to balance new designs against maintenance effort.
>> Currently Xiaofeng Lin from Intel is working on para-drivers for
>> unmodified domains; both VBD & VNIF already work for x86 VT
>> domains and are being reviewed by Cambridge. This work is based
>> on the PMT table.
>>
>> Kevin
>>>
>>> On Mon, Oct 31, 2005 at 01:28:43PM +0800, Tian, Kevin wrote:
>>>> Hi, Matt,
>>>>
>>>> The point here is how to tell when a donated frame is done with and
>>>> where the "free" actually happens in domU. Currently the Linux
>>>> network driver uses zero-copy to pass a received packet up without
>>>> any copying. In that case the receive pages are allocated as
>>>> skbuffs, which are freed by the upper layers rather than by the vnif
>>>> driver itself. To let dom0 know when the donated page is done, you
>>>> can either:
>>>> - Copy the content from the donated page into a local skbuff page
>>>> and notify dom0 immediately, at the cost of performance; or
>>>> - Modify the upper-layer code to register a "free" hook that
>>>> notifies dom0 when done, at the cost of more modification to common
>>>> code and divergence from x86.
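>>>>
>>>> A purely illustrative sketch of the first option (the function and
>>>> callback names are made up, not real vnif code):
>>>>
>>>>     #include <string.h>
>>>>
>>>>     /* Option 1: trade a copy for an immediate notification, so the
>>>>      * donated page never enters the normal skbuff free path. */
>>>>     static void rx_copy_and_notify(void *local_skb_page,
>>>>                                    const void *donated_page, size_t len,
>>>>                                    void (*notify_dom0_done)(void))
>>>>     {
>>>>         memcpy(local_skb_page, donated_page, len);
>>>>         notify_dom0_done();   /* dom0 can reuse the page right away */
>>>>     }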
>>>>
>>>> There are certainly other ways to make this approach "work", and
>>>> even more alternatives. But the point we really want to emphasize
>>>> is that by adding a PMT we can move towards the x86 solution, with
>>>> the best performance and less maintenance effort. That also
>>>> minimizes our future re-base effort as the para-drivers keep
>>>> evolving. ;-)
>>>>
>>>> Thanks,
>>>> Kevin
>>>>
>>>>> -----Original Message-----
>>>>> From: Matt Chapman [mailto:matthewc@xxxxxxxxxxxxxxx]
>>>>> Sent: October 31, 2005 13:09
>>>>> To: Tian, Kevin
>>>>> Cc: Dong, Eddie; xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>>>>> Subject: Re: [Xen-ia64-devel] Re: PMT table for XEN/IA64 (was: RE:
>>>>> Transparent paravirtualization vs. xen paravirtualization)
>>>>>
>>>>> Yes, I think I understand the problem now.
>>>>>
>>>>> The way I imagine this could work is that Dom0 would know about
>>>>> all of the memory in the machine (i.e. it would be passed the
>>>>> original EFI memmap, minus memory used by Xen).
>>>>>
>>>>> Then Dom0 would donate memory for other domains (=ballooning).
>>>>> Dom0 can donate data frames to DomU in the same way - by granting
>>>>> the frame and not freeing it. When DomU donates a data frame to
>>>>> Dom0, Dom0 frees it when it is done, and now the kernel can use
>>>>> it.
>>>>>
>>>>> What do you think of this approach?
>>>>>
>>>>> Matt
>>>>>
>>>>>
>>>>> On Mon, Oct 31, 2005 at 11:09:04AM +0800, Tian, Kevin wrote:
>>>>>> Hi, Matt,
>>>>>> It's not about mapped virtual addresses, only about
>>>>>> physical/machine pfns. The current vnif backend (on x86) works as
>>>>>> follows:
>>>>>>
>>>>>> 1. Allocate a set of physical pfns from the kernel
>>>>>> 2. Chop up the mapping between each physical pfn and its old machine pfn
>>>>>> 3. Transfer ownership of the old machine pfn to the frontend
>>>>>> 4. Allocate a new machine pfn and bind it to that physical pfn
>>>>>> (In this case there is no ownership return from the frontend, for
>>>>>> performance reasons.)
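>>>>>>
>>>>>> Roughly, in pseudo-C (illustrative only, with the PMT modeled as a
>>>>>> plain gpfn-indexed array of mfns per domain; these are not the real
>>>>>> netback functions):
>>>>>>
>>>>>>     #define INVALID_MFN (~0UL)
>>>>>>
>>>>>>     static void vnif_backend_flip(unsigned long *dom0_pmt,
>>>>>>                                   unsigned long gpfn,
>>>>>>                                   unsigned long *domU_pmt,
>>>>>>                                   unsigned long domU_gpfn,
>>>>>>                                   unsigned long new_mfn)
>>>>>>     {
>>>>>>         unsigned long old_mfn = dom0_pmt[gpfn]; /* backing machine page  */
>>>>>>         dom0_pmt[gpfn] = INVALID_MFN;           /* 2. chop the mapping   */
>>>>>>         domU_pmt[domU_gpfn] = old_mfn;          /* 3. give it to domU    */
>>>>>>         dom0_pmt[gpfn] = new_mfn;               /* 4. rebind a fresh mfn */
>>>>>>     }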
>>>>>>
>>>>>> Without a PMT table (i.e. assuming guest==machine for dom0), that
>>>>>> means you have to hotplug physical pfns out of the guest (at page
>>>>>> granularity) under the current vnif model. Or maybe you have a
>>>>>> better alternative that needs no PMT and at the same time no big
>>>>>> change to the existing vnif driver?
>>>>>>
>>>>>> Thanks,
>>>>>> Kevin
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx
>>>>>>> [mailto:xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of
>>>>>>> Matt Chapman
>>>>>>> Sent: October 31, 2005 10:59
>>>>>>> To: Dong, Eddie
>>>>>>> Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>>>>>>> Subject: [Xen-ia64-devel] Re: PMT table for XEN/IA64 (was: RE:
>>>>>>> Transparent paravirtualization vs. xen paravirtualization)
>>>>>>>
>>>>>>> Hi Eddie,
>>>>>>>
>>>>>>> The way I did it was to make the address argument to grant
>>>>>>> hypercalls in/out; that is, the hypervisor might possibly return
>>>>>>> a different address than the one requested, like mmap on UNIX.
>>>>>>>
>>>>>>> For DomU, the hypervisor would map the page at the requested
>>>>>>> address. For Dom0, the hypervisor would instead return the
>>>>>>> existing address of that page, since Dom0 already has access
>>>>>>> to the whole address space.
>>>>>>>
>>>>>>> (N.B. I'm referring to physical/machine mappings here; unlike
>>>>>>> the x86 implementation where the grant table ops map pages
>>>>>>> directly into virtual address space).
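>>>>>>>
>>>>>>> Roughly, the shape of it (illustrative only, not the real
>>>>>>> grant-table ABI):
>>>>>>>
>>>>>>>     struct grant_map_request {
>>>>>>>         unsigned long addr;   /* IN: address the guest asks for;
>>>>>>>                                * OUT: address actually used.     */
>>>>>>>         unsigned long flags;
>>>>>>>     };
>>>>>>>
>>>>>>>     /* Caller pattern, mmap-style: pass a hint, use what comes back.
>>>>>>>      * For DomU the hypervisor maps at the requested address; for
>>>>>>>      * Dom0 it just returns the page's existing machine address. */
>>>>>>>     static unsigned long map_granted_page(
>>>>>>>         struct grant_map_request *req,
>>>>>>>         long (*grant_hypercall)(struct grant_map_request *))
>>>>>>>     {
>>>>>>>         return (grant_hypercall(req) < 0) ? 0 : req->addr;
>>>>>>>     }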
>>>>>>>
>>>>>>> Matt
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Oct 28, 2005 at 10:28:08PM +0800, Dong, Eddie wrote:
>>>>>>>>> Page flipping should work just fine
>>>>>>>>> in the current design; Matt had it almost working (out of
>>>>>>>>> tree) before he went back to school.
>>>>>>>>>
>>>>>>>> Matt:
>>>>>>>> Dan mentioned that you had the VNIF work almost done without PMT
>>>>>>>> table support for dom0. Can you share the idea with us?
>>>>>>>> Usually VNIF swaps pages between dom0 and domU so that the
>>>>>>>> network packet copy (between the dom0 native driver and the domU
>>>>>>>> frontend driver) can be avoided, achieving high performance.
>>>>>>>> With that swap, we can no longer assume dom0 gpn=mfn. So how did
>>>>>>>> you propose to port VNIF without a PMT table? Thanks a
>>>>>>>> lot, eddie
>>>>>>>
>>>
_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel