


To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-devel] [RFC] implement "trap-process-return"-like behavior with grant ring + evtchn
From: Wei Liu <liuw@xxxxxxxxx>
Date: Thu, 23 Jun 2011 15:21:02 +0800
Cc: Ian Campbell <Ian.Campbell@xxxxxxxxxxxxx>, Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>
Hi, all

As you all know, I'm implementing pure-Xen virtio, focusing mainly on
its transport layer. In the HVM case, the transport layer is
implemented as virtual PCI. I'm now about to replace this layer with a
grant ring + evtchn implementation.

In the HVM case with the virtual PCI transport layer, virtio works as follows:
1. guest writes to PCI cfg space to get trapped by hypervisor
2. backend dispatches and processes request
3. return to guest
This is a *synchronous* communication.

However, evtchn is by design *asynchronous*, which means that if we
write our requests to the ring and notify the other end, we have no
idea when they will get processed. This will cause race conditions. Say:

1. FE changes config: that is, it writes to configuration space and
notifies the other end via evtchn;
2. FE immediately reads configuration space, but the change may not
have been acked or recorded by BE yet.
As a result, FE / BE states are inconsistent.
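
To make the race concrete, here is a minimal single-process sketch in C.
It is only an illustration: the names (race_model, fe_write_async,
fe_read, be_process) are hypothetical and not part of Xen or virtio, the
ring is reduced to one pending value, and the evtchn notification is
reduced to a flag.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: BE owns the device state, FE only posts requests. */
struct race_model {
    uint32_t be_state;      /* value BE has actually recorded */
    uint32_t ring_value;    /* write request sitting in the ring */
    int      ring_pending;  /* 1 while BE has not processed the request */
};

/* FE: post a config write and "notify"; returns without waiting. */
static void fe_write_async(struct race_model *m, uint32_t v)
{
    assert(!m->ring_pending);   /* one outstanding request at a time */
    m->ring_value = v;
    m->ring_pending = 1;        /* the evtchn notification would go here */
}

/* FE: read the config value as BE currently sees it. */
static uint32_t fe_read(const struct race_model *m)
{
    return m->be_state;
}

/* BE: runs at some later time that FE cannot predict. */
static void be_process(struct race_model *m)
{
    if (m->ring_pending) {
        m->be_state = m->ring_value;
        m->ring_pending = 0;
    }
}
```

If FE calls fe_write_async() and then fe_read() before be_process() has
run, it still sees the old value; only after BE has processed the request
do the two sides agree.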

NOTE: Here by "configuration space" I mean the internal states of
devices, not limited to PCI configuration space.

Stefano and IanC suggested that FE spin-wait for an answer from BE. I
have come up with a rough design and would like to ask for your advice.

This is how I would do it.

Setup:
1. Set up an evtchn (cfg-evtchn) and two grant rings (fe-to-be / be-to-fe).
2. Zero out both rings.

FE:
1. Puts a request in the fe-to-be ring, then notifies BE via cfg-evtchn.
2. Spin-waits for exactly one answer in the be-to-fe ring, otherwise BUG().
3. Consumes that answer and resets the be-to-fe ring.

BE:
1. Gets notified.
2. Checks that there is exactly one request in the fe-to-be ring,
otherwise BUG().
3. Consumes the request in the fe-to-be ring.
4. Writes the answer back to the be-to-fe ring.
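
The steps above could be sketched roughly as follows. This is a
single-process simulation under stated assumptions, not an
implementation: the message layout (cfg_msg), the one-slot ring
(cfg_ring), and all function names are hypothetical, the evtchn is
modelled as a flag, and sim_run_backend() stands in for BE running in
another domain. Real code would place the rings in granted pages and
would need memory barriers around the count updates.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum cfg_op { CFG_READ = 1, CFG_WRITE = 2 };   /* hypothetical opcodes */

struct cfg_msg {
    uint32_t op;       /* CFG_READ or CFG_WRITE */
    uint32_t offset;   /* byte offset into the device state */
    uint32_t value;    /* value to write, or value read back */
};

/* One-slot ring: count is 0 (empty) or 1 (one message waiting). */
struct cfg_ring {
    uint32_t count;
    struct cfg_msg slot;
};

struct cfg_chan {
    struct cfg_ring fe_to_be;
    struct cfg_ring be_to_fe;
    int evtchn_pending;     /* stands in for cfg-evtchn */
    uint8_t dev_cfg[64];    /* BE-side device state */
};

/* Setup: zero both rings (and the simulated device state). */
static void cfg_chan_init(struct cfg_chan *c)
{
    memset(c, 0, sizeof(*c));
}

/* BE: handler invoked when cfg-evtchn fires. */
static void be_handle_event(struct cfg_chan *c)
{
    assert(c->fe_to_be.count == 1);           /* else BUG() */
    struct cfg_msg req = c->fe_to_be.slot;
    c->fe_to_be.count = 0;                    /* consume the request */

    assert(req.offset < sizeof(c->dev_cfg));
    if (req.op == CFG_WRITE)
        c->dev_cfg[req.offset] = (uint8_t)req.value;
    req.value = c->dev_cfg[req.offset];       /* answer carries current value */

    c->be_to_fe.slot = req;
    c->be_to_fe.count = 1;                    /* publish; no evtchn back to FE */
}

/* Simulation only: deliver a pending notification to BE. */
static void sim_run_backend(struct cfg_chan *c)
{
    if (c->evtchn_pending) {
        c->evtchn_pending = 0;
        be_handle_event(c);
    }
}

/* FE: synchronous config access built on the asynchronous channel. */
static uint32_t fe_cfg_access(struct cfg_chan *c, uint32_t op,
                              uint32_t offset, uint32_t value)
{
    assert(c->fe_to_be.count == 0 && c->be_to_fe.count == 0);
    c->fe_to_be.slot = (struct cfg_msg){ .op = op, .offset = offset,
                                         .value = value };
    c->fe_to_be.count = 1;
    c->evtchn_pending = 1;                    /* notify BE via cfg-evtchn */

    while (c->be_to_fe.count == 0)
        sim_run_backend(c);                   /* a real FE would only spin here */

    assert(c->be_to_fe.count == 1);           /* else BUG() */
    uint32_t result = c->be_to_fe.slot.value;
    memset(&c->be_to_fe, 0, sizeof(c->be_to_fe));   /* reset be-to-fe ring */
    return result;
}
```

With this shape, a write followed immediately by a read of the same
offset returns the written value, because fe_cfg_access() does not
return until BE has answered.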

As you can see, cfg-evtchn is only used for fe-to-be notification.
This allows BE to relax: it can wait for the event instead of polling,
and only FE has to spin.

This is only a rough design; I haven't implemented it yet. If anyone
has a better idea or spots any problem with this design, please let me
know. Your advice is always welcome.

