Matt Chapman wrote:
> 1. Packet arrives in a Dom0 SKB. Of course the buffer needs
> to be page sized/aligned (this is true on x86 too).
> 2. netback steals the buffer
> 3. netback donates it to DomU *without freeing it*
> 4. DomU receives the frame and passes it up its network stack
> 5. DomU gives away other frame(s) to restore balance
> 6. Dom0 eventually receives extra frames via its balloon driver
>
> 5 and 6 can be done lazily in batches. Alternatively, 4 and 5
> could be a single "flip" operation.
The solution can work with some tweaks. But does it have any obvious benefit
over the PMT approach used on x86? (If yes, you should suggest it to
xen-devel ;-) Usually we only take a different approach when something either
"can't be done on this architecture" or offers "far better performance than
the existing one"; otherwise, why diverge from the Xen design and take on the
extra maintenance effort? That extra effort has already cost us 2+ weeks
getting VBD up to support DomU across the last 2 upstream merges.
>
> I think this is not significantly different from x86.
>
> I'm not saying this is necessarily better than a PMT solution,
> but I want to discuss the differences and trade-offs. By PMT
> I assume you mean to make Dom0 not 1:1 mapped, and then give
> it access to the translation table? Can you describe how the
> above works differently with a PMT?
Describing just the work flow, the PMT approach is similar: the backend and
frontend still need to touch the PMT table for the ownership change. But have
you evaluated how many tricky changes would be required to support Domain0
with gpn=mfn on top of the existing code? For example:
- Backend drivers are not bound to dom0; they can also run in a domU acting
as a driver domain, where a 1:1 mapping makes no sense. There has already
been some discussion of domUs serving as driver/IO domains.
- You need to ensure that all available pages are granted to dom0, which
means changing the current dom0 allocation code.
- You need to change the current vnif code with an unknown number of #ifdefs
and workarounds, since you would be implementing new behaviour on top of a
different approach.
- ... (maintenance!)
So if you were implementing a VMM from scratch, your approach would
definitely be worth trying, since there would be no such constraints. But
since we are working on Xen, we should take as much advantage of the existing
Xen design as possible, right? ;-)
>
> One disadvantage I see of having Dom0 not 1:1 is that superpages
> are more difficult, we can't just use the guest's superpages.
Superpages are an optimization option, and we still need to support
discontiguous pages as a basic requirement. You can still add an option to
allocate contiguous pages for a guest even with a PMT table, since
para-virtualization is cooperative.
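
As a rough illustration of what such a table boils down to, here is a
standalone C sketch (not the real Xen PMT/p2m code; all values and names are
made up) showing that a 1:1 dom0 is just the identity table, and that a
contiguous allocation keeps superpage mappings possible:

    /* Minimal model of a per-domain PMT (physical-to-machine table):
     * just an array indexed by guest pfn.  Purely illustrative. */
    #include <stdio.h>

    #define PMT_SIZE 16
    #define INVALID_MFN (-1L)

    static long pmt[PMT_SIZE];          /* gpfn -> mfn */

    static long gpfn_to_mfn(long gpfn)
    {
        return (gpfn >= 0 && gpfn < PMT_SIZE) ? pmt[gpfn] : INVALID_MFN;
    }

    int main(void)
    {
        /* Identity mapping: what gpn==mfn for dom0 would look like. */
        for (long g = 0; g < PMT_SIZE; g++)
            pmt[g] = g;

        /* After a page flip, gpfn 3 points at a donated mfn instead. */
        pmt[3] = 42;

        /* A contiguous allocation keeps mfns consecutive, so a superpage
         * mapping over gpfns 8..11 would still be possible. */
        for (long g = 8; g < 12; g++)
            pmt[g] = 100 + (g - 8);

        printf("gpfn 3 -> mfn %ld\n", gpfn_to_mfn(3));
        printf("gpfn 9 -> mfn %ld\n", gpfn_to_mfn(9));
        return 0;
    }
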
>
> Also, are there paravirtualisation changes needed to support a
> PMT? I'm concerned about not making the paravirtualisation
> changes too complex (I think x86 Xen changes the OS too much).
> Also, it should be possible to load Xen frontend drivers into
> unmodified OSs (on VT).
We need to balance new designs against maintenance effort. Currently
Xiaofeng Lin from Intel is working on para-drivers for unmodified domains;
both VBD & VNIF are already working for x86 VT domains and are under review
by Cambridge. This work is based on the PMT table.
Kevin
>
> On Mon, Oct 31, 2005 at 01:28:43PM +0800, Tian, Kevin wrote:
>> Hi, Matt,
>>
>> The point here is how to tell when a donated frame is done, and where the
>> "free" actually happens in domU. Currently the Linux network driver uses
>> zero-copy to pass a received packet up the stack without any copying. In
>> this case the receive pages are allocated as skbuffs, which are freed by
>> the upper layers rather than by the vnif driver itself. To let dom0 know
>> when a donated page is done, you can either:
>> - Copy the contents of the donated page into a local skbuff page and
>> notify dom0 immediately, at the cost of performance, or
>> - Modify the upper-layer code to register a "free" hook that notifies
>> dom0 when the page is done, at the cost of more modification to common
>> code and divergence from x86.
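
For illustration, here is a minimal standalone C sketch of the second option,
a registered "free" hook that fires only when the upper layer finally frees
the buffer (this is not kernel code; every name here is hypothetical):

    /* Standalone model of the "free hook" option: the final free, done by
     * the upper layer rather than by the vnif driver, is what signals
     * that the donated page is done.  All names are illustrative. */
    #include <stdio.h>
    #include <stdlib.h>

    struct rx_buf {
        long mfn;                              /* donated machine frame */
        void (*free_hook)(struct rx_buf *);    /* set by the vnif driver */
    };

    static int pages_to_return;                /* batched return to dom0 */

    static void vnif_page_done(struct rx_buf *b)
    {
        /* Called when the upper layer finally frees the buffer. */
        pages_to_return++;
        printf("mfn %ld done, %d page(s) queued for return to dom0\n",
               b->mfn, pages_to_return);
    }

    /* Upper-layer code only knows to call the hook before freeing. */
    static void upper_layer_consume(struct rx_buf *b)
    {
        /* ... process packet ... */
        if (b->free_hook)
            b->free_hook(b);
        free(b);
    }

    int main(void)
    {
        struct rx_buf *b = malloc(sizeof(*b));
        b->mfn = 42;
        b->free_hook = vnif_page_done;   /* vnif driver registers the hook */
        upper_layer_consume(b);          /* dom0 notified only at free time */
        return 0;
    }
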
>>
>> There are certainly other ways to make this approach "work", and even
>> more alternatives. However, the point we really want to emphasize is that
>> by adding a PMT we can move towards the x86 solution, with the best
>> performance and less maintenance effort. That would minimize our future
>> re-base effort as the para-drivers keep evolving. ;-)
>>
>> Thanks,
>> Kevin
>>
>>> -----Original Message-----
>>> From: Matt Chapman [mailto:matthewc@xxxxxxxxxxxxxxx]
>>> Sent: October 31, 2005 13:09
>>> To: Tian, Kevin
>>> Cc: Dong, Eddie; xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>>> Subject: Re: [Xen-ia64-devel] Re: PMT table for XEN/IA64 (was: RE:
>>> Transparent paravirtualization vs. xen paravirtualization)
>>>
>>> Yes, I think I understand the problem now.
>>>
>>> The way I imagine this could work is that Dom0 would know about all
>>> of the memory in the machine (i.e. it would be passed the original
>>> EFI memmap, minus memory used by Xen).
>>>
>>> Then Dom0 would donate memory for other domains (=ballooning).
>>> Dom0 can donate data frames to DomU in the same way - by granting
>>> the frame and not freeing it. When DomU donates a data frame to
>>> Dom0, Dom0 frees it when it is done, and now the kernel can use it.
>>>
>>> What do you think of this approach?
>>>
>>> Matt
>>>
>>>
>>> On Mon, Oct 31, 2005 at 11:09:04AM +0800, Tian, Kevin wrote:
>>>> Hi, Matt,
>>>> It's not related to the mapped virtual address, but only to the
>>>> physical/machine pfn. The current vnif backend (on x86) works as
>>>> follows:
>>>>
>>>> 1. Allocate a set of physical pfns from the kernel
>>>> 2. Chop up the mapping between the physical pfn and the old machine pfn
>>>> 3. Transfer ownership of the old machine pfn to the frontend
>>>> 4. Allocate a new machine pfn and bind it to that physical pfn
>>>> (In this case there is no ownership return from the frontend, for
>>>> performance reasons)
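
For illustration, here is a toy standalone C model of steps 1-4 above against
a flat pfn->mfn array (not Xen code; all names and numbers are made up):

    /* Toy model of the backend transfer: break the pfn->mfn binding,
     * hand the old mfn to the frontend, and rebind the pfn to a freshly
     * allocated mfn.  Purely illustrative. */
    #include <stdio.h>

    #define INVALID_MFN (-1L)

    static long dom0_p2m[4]  = { 10, 11, 12, 13 };   /* pfn -> mfn */
    static long free_mfns[4] = { 20, 21, 22, 23 };
    static int  nr_free = 4;
    static long frontend_mfns[8];
    static int  nr_frontend;

    static void transfer_rx_page(int pfn)
    {
        long old_mfn = dom0_p2m[pfn];

        dom0_p2m[pfn] = INVALID_MFN;              /* 2. chop the mapping   */
        frontend_mfns[nr_frontend++] = old_mfn;   /* 3. old mfn -> frontend */
        dom0_p2m[pfn] = free_mfns[--nr_free];     /* 4. bind a new mfn     */

        printf("pfn %d: mfn %ld -> frontend, now backed by mfn %ld\n",
               pfn, old_mfn, dom0_p2m[pfn]);
    }

    int main(void)
    {
        transfer_rx_page(1);   /* 1. pfn 1 was allocated as an rx buffer */
        return 0;
    }
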
>>>>
>>>> Without a PMT table (assuming guest==machine for dom0), that means you
>>>> have to hotplug physical pfns from the guest (at page granularity)
>>>> under the current vnif model. Or maybe you have a better alternative
>>>> that works without a PMT, and at the same time without big changes to
>>>> the existing vnif driver?
>>>>
>>>> Thanks,
>>>> Kevin
>>>>
>>>>> -----Original Message-----
>>>>> From: xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx
>>>>> [mailto:xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of
>>>>> Matt Chapman
>>>>> Sent: October 31, 2005 10:59
>>>>> To: Dong, Eddie
>>>>> Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>>>>> Subject: [Xen-ia64-devel] Re: PMT table for XEN/IA64 (was: RE:
>>>>> Transparent paravirtualization vs. xen paravirtualization)
>>>>>
>>>>> Hi Eddie,
>>>>>
>>>>> The way I did it was to make the address argument to grant
>>>>> hypercalls in/out; that is, the hypervisor might possibly return
>>>>> a different address than the one requested, like mmap on UNIX.
>>>>>
>>>>> For DomU, the hypervisor would map the page at the requested
>>>>> address. For Dom0, the hypervisor would instead return the
>>>>> existing address of that page, since Dom0 already has access
>>>>> to the whole address space.
>>>>>
>>>>> (N.B. I'm referring to physical/machine mappings here; unlike
>>>>> the x86 implementation where the grant table ops map pages
>>>>> directly into virtual address space).
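
For illustration, here is a standalone C sketch of that in/out address
convention (the grant_map function and types below are hypothetical
stand-ins, not the real hypercall interface):

    /* Sketch of an in/out address argument: the caller passes a preferred
     * address, and the "hypervisor" may return a different one (for dom0,
     * the page's existing machine address), much like mmap() returning an
     * address other than the hint.  All names here are made up. */
    #include <stdio.h>

    struct fake_grant { long mfn; };   /* stand-in for a grant entry */

    /* is_dom0 models caller privilege; *addr is in/out. */
    static int grant_map(const struct fake_grant *g, int is_dom0, long *addr)
    {
        if (is_dom0) {
            /* Dom0 already sees the whole machine address space, so just
             * report where the page already lives. */
            *addr = g->mfn << 14;      /* 16K page size assumed */
        } else {
            /* For domU, accept the requested address (mapping elided). */
        }
        return 0;
    }

    int main(void)
    {
        struct fake_grant g = { 0x1234 };
        long addr = 0x100000;          /* caller's requested address */

        grant_map(&g, 1 /* dom0 */, &addr);
        printf("dom0 mapping returned at 0x%lx\n", addr);

        addr = 0x100000;
        grant_map(&g, 0 /* domU */, &addr);
        printf("domU mapping kept at 0x%lx\n", addr);
        return 0;
    }
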
>>>>>
>>>>> Matt
>>>>>
>>>>>
>>>>> On Fri, Oct 28, 2005 at 10:28:08PM +0800, Dong, Eddie wrote:
>>>>>>> Page flipping should work just fine
>>>>>>> in the current design; Matt had it almost working (out of tree)
>>>>>>> before he went back to school.
>>>>>>>
>>>>>> Matt:
>>>>>> Dan mentioned that you had the VNIF work almost done without PMT
>>>>>> table support for dom0. Can you share the idea with us?
>>>>>> Usually VNIF swaps pages between dom0 and domU so that the network
>>>>>> packet copy (between the dom0 native driver and the domU frontend
>>>>>> driver) can be avoided, achieving high performance. With this swap,
>>>>>> we can no longer assume dom0 gpn=mfn. So how did you propose to port
>>>>>> VNIF without a PMT table? Thanks a lot, eddie
>>>>>
>
_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel