   
 
 
[Xen-devel] Re: mem-event interface

To: "Xen-Devel (E-mail)" <xen-devel@xxxxxxxxxxxxxxxxxxx>, george.dunlap@xxxxxxxxxxxxx, Andrew Peace <Andrew.Peace@xxxxxxxxxxxxx>, Steven Hand <Steven.Hand@xxxxxxxxxxxxx>, Patrick Colp <pjcolp@xxxxxxxxx>, "Bryan D. Payne" <bryan@xxxxxxxxxxxx>
Subject: [Xen-devel] Re: mem-event interface
From: Grzegorz Milos <grzegorz.milos@xxxxxxxxx>
Date: Wed, 23 Jun 2010 23:24:46 +0100
Cc:
Delivery-date: Wed, 23 Jun 2010 15:42:32 -0700
[From Bryan]

> I guess I'm more envisioning integrating all this with libxc and
> having XenAccess et al. use that. Keeping it as a separate VM
> introspection library makes sense too. In any case, I think having
> XenAccess as part of Xen is a good move. VM introspection is a useful
> thing to have and I think a lot of projects could benefit from it.

From my experience, the address translations can actually be pretty
tricky.  This is a big chunk of what XenAccess does, and it requires
some memory analysis in the domU to find necessary page tables and
such.  So it may be more than you really want to add to libxc.  But if
you go down this route, then I could certainly simplify the XenAccess
code, so I wouldn't complain about that :-)
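
To give a rough feel for why the translation needs access to the
domU's own memory, here is a minimal sketch of a page-table walk. It
is not XenAccess or libxc code: it assumes plain 32-bit non-PAE x86
paging with 4 KiB pages only, and GuestMemory and translate() are
hypothetical names that stand in for mapping guest frames.

```python
import struct

PAGE_SHIFT = 12
PAGE_SIZE = 1 << PAGE_SHIFT
PRESENT = 0x1  # bit 0 of a page directory/table entry


class GuestMemory:
    """Toy stand-in for guest physical memory, keyed by frame number."""

    def __init__(self):
        self.frames = {}

    def write_u32(self, paddr, value):
        frame = self.frames.setdefault(paddr >> PAGE_SHIFT,
                                       bytearray(PAGE_SIZE))
        struct.pack_into("<I", frame, paddr & (PAGE_SIZE - 1), value)

    def read_u32(self, paddr):
        frame = self.frames.get(paddr >> PAGE_SHIFT)
        if frame is None:
            raise LookupError("frame not mapped: %#x" % (paddr >> PAGE_SHIFT))
        return struct.unpack_from("<I", frame, paddr & (PAGE_SIZE - 1))[0]


def translate(mem, cr3, vaddr):
    """Walk a two-level page table; return (pfn, offset) or None."""
    # Level 1: page directory entry, indexed by the top 10 bits.
    pde_addr = (cr3 & ~(PAGE_SIZE - 1)) + ((vaddr >> 22) & 0x3FF) * 4
    pde = mem.read_u32(pde_addr)
    if not pde & PRESENT:
        return None
    # Level 2: page table entry, indexed by the middle 10 bits.
    pte_addr = (pde & ~(PAGE_SIZE - 1)) + ((vaddr >> 12) & 0x3FF) * 4
    pte = mem.read_u32(pte_addr)
    if not pte & PRESENT:
        return None
    return pte >> PAGE_SHIFT, vaddr & (PAGE_SIZE - 1)
```

Each level of the walk is another read of guest memory, and the walk
cannot even start until the page directory base (cr3 here) has been
found for the process of interest. PAE, 64-bit and large pages add
further cases that this sketch ignores.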

-bryan

On Wed, Jun 23, 2010 at 11:24 PM, Grzegorz Milos
<grzegorz.milos@xxxxxxxxx> wrote:
> [From Patrick]
>
> I guess I'm more envisioning integrating all this with libxc and
> having XenAccess et al. use that. Keeping it as a separate VM
> introspection library makes sense too. In any case, I think having
> XenAccess as part of Xen is a good move. VM introspection is a useful
> thing to have and I think a lot of projects could benefit from it.
>
>
> Patrick
>
> On Wed, Jun 23, 2010 at 11:23 PM, Grzegorz Milos
> <grzegorz.milos@xxxxxxxxx> wrote:
>> [From Bryan]
>>
>>> XenAccess, but how feasible is it to even move some of the gva/pfn/mfn
>>> translation code out into the library and have the mem_event daemon
>>> use that? I do remember reading through and borrowing XenAccess code
>>
>> This is certainly doable.  But if we decide to make a Xen library
>> depend on XenAccess, then it would make sense to include XenAccess as
>> part of the Xen distribution, IMHO.  This probably isn't too
>> unreasonable to consider, but we'd want to make sure that the
>> XenAccess configuration is either simplified or eliminated to avoid
>> causing headaches for the average person using this stuff.  Something
>> to think about...
>>
>> -bryan
>>
>> On Wed, Jun 23, 2010 at 11:23 PM, Grzegorz Milos
>> <grzegorz.milos@xxxxxxxxx> wrote:
>>> [From Patrick]
>>>
>>>> I like this idea as it keeps Xen as simple as possible and should also
>>>> help to reduce the number of notifications sent from Xen up to user
>>>> space (e.g., one notification to the daemon could then be pushed out
>>>> to multiple clients that care about it).
>>>
>>> Yeah, that was my general thinking as well. So the immediate change to
>>> the mem_event interface for this would be a way to specify sub-page
>>> level stuff. The best way to approach this is probably by specifying a
>>> start and end range (or more likely start address and size). This way
>>> things like swapping and sharing would specify the start address of
>>> the page they're interested in and PAGE_SIZE (or, more realistically
>>> there would be an additional lib call to do page-level stuff, which
>>> would just take the pfn and do this translation under the hood).
>>>
>>>
>>>> For what it's worth, I'd be happy to build such a daemon into
>>>> XenAccess.  This may be a logical place for it since XenAccess is
>>>> already doing address translations and such, so it would be easier for
>>>> a client app to specify an address range of interest as a virtual
>>>> address or physical address.  This would prevent the need to repeat
>>>> some of that address translation functionality in yet another library.
>>>>
>>>> Alternatively, we could provide the daemon functionality in libxc or
>>>> some other Xen library and only provide support for low level
>>>> addresses (e.g., pfn + offset).  Then XenAccess could build on top of
>>>> that to offer higher level addresses (e.g., pa or va) using its
>>>> existing translation mechanisms.  This approach would more closely
>>>> mirror the current division of labor between XenAccess and libxc.
>>>
>>> This sounds good to me. I'd lean towards the second approach as I
>>> think it's the better long-term solution. I'm a bit rusty on my
>>> XenAccess, but how feasible is it to even move some of the gva/pfn/mfn
>>> translation code out into the library and have the mem_event daemon
>>> use that? I do remember reading through and borrowing XenAccess code
>>> (or at least the general mechanism) to do address translation stuff
>>> for other projects, so it seems like having a general way to do that
>>> would be a win. I think I did it with the CoW stuff, which I actually
>>> want to port to the mem_event interface as well, both to have it
>>> available and as another example of neat things we can do with the
>>> interface.
>>>
>>>
>>> Patrick
>>>
>>> On Wed, Jun 23, 2010 at 11:22 PM, Grzegorz Milos
>>> <grzegorz.milos@xxxxxxxxx> wrote:
>>>> [From Bryan]
>>>>
>>>>> needs to know to do sync notification). What are everybody's thoughts on
>>>>> this? Does it seem reasonable or have I gone completely mad?
>>>>
>>>> I like this idea as it keeps Xen as simple as possible and should also
>>>> help to reduce the number of notifications sent from Xen up to user
>>>> space (e.g., one notification to the daemon could then be pushed out
>>>> to multiple clients that care about it).
>>>>
>>>> For what it's worth, I'd be happy to build such a daemon into
>>>> XenAccess.  This may be a logical place for it since XenAccess is
>>>> already doing address translations and such, so it would be easier for
>>>> a client app to specify an address range of interest as a virtual
>>>> address or physical address.  This would prevent the need to repeat
>>>> some of that address translation functionality in yet another library.
>>>>
>>>> Alternatively, we could provide the daemon functionality in libxc or
>>>> some other Xen library and only provide support for low level
>>>> addresses (e.g., pfn + offset).  Then XenAccess could build on top of
>>>> that to offer higher level addresses (e.g., pa or va) using its
>>>> existing translation mechanisms.  This approach would more closely
>>>> mirror the current division of labor between XenAccess and libxc.
>>>>
>>>> -bryan
>>>>
>>>> On Wed, Jun 23, 2010 at 11:22 PM, Grzegorz Milos
>>>> <grzegorz.milos@xxxxxxxxx> wrote:
>>>>> [From Patrick]
>>>>>
>>>>>> Since I'm coming in the middle of this discussion, forgive me if I've
>>>>>> missed something.  But is the idea here to create a more general
>>>>>> interface that could support various different types of memory events
>>>>>> + notification?  And the two events listed below are just a subset of
>>>>>> the events that could / would be supported?
>>>>>
>>>>> That's correct.
>>>>>
>>>>>
>>>>>> In general, I like the sound of where this is going but I would like
>>>>>> to see support for notification of events such as when a domU reads /
>>>>>> writes / execs a pre-specified byte(s) of memory.  As such, there
>>>>>> would need to be a notification path (as discussed below) and also a
>>>>>> control path to set up the memory regions that the user app cares
>>>>>> about.
>>>>>
>>>>> Sub-page events are something I would like to have included as well.
>>>>> Currently the control path is basically just "nominating" a page (for
>>>>> either swapping or sharing). It's not entirely clear to me the best
>>>>> way to go about this. With swapping and sharing we have code in Xen to
>>>>> handle both cases. However, to just receive notifications (like
>>>>> "read", "write", "execute") I don't think we need specialised support
>>>>> (or at least just once to handle the notifications). I'm thinking it
>>>>> might be good to have a daemon to handle these events in user-space
>>>>> and register clients with the user-space daemon. Each client would get
>>>>> a unique client ID which could be used to identify who should get the
>>>>> response. This way, we could just register that somebody is interested
>>>>> in that page (or byte, etc) and let the user-space tool handle most of
>>>>> the complex logic (i.e. which of the clients should that particular
>>>>> notification go to). This requires some notion of priority for memory
>>>>> areas (e.g. if one client requests notification for access to a byte
>>>>> of page foo and another requests notification for access to any of
>>>>> page foo, then we only need Xen to store that it should notify for
>>>>> page foo and just send along which byte(s) of the page were accessed
>>>>> as well, then the user-space daemon can determine if both clients
>>>>> should be notified or just the one) (e.g. if one client requests async
>>>>> notification and another requests sync notification, then Xen only
>>>>> needs to know to do sync notification). What are everybody's thoughts on
>>>>> this? Does it seem reasonable or have I gone completely mad?
>>>>>
>>>>>
>>>>> Patrick
>>>>>
>>>>> On Wed, Jun 23, 2010 at 11:21 PM, Grzegorz Milos
>>>>> <grzegorz.milos@xxxxxxxxx> wrote:
>>>>>> [From Bryan]
>>>>>>
>>>>>> Bryan D. Payne, Jun 16
>>>>>>  to Patrick, me, george.dunlap, Andrew, Steven
>>>>>>
>>>>>> Patrick, thanks for the inclusion.
>>>>>>
>>>>>> Since I'm coming in the middle of this discussion, forgive me if I've
>>>>>> missed something.  But is the idea here to create a more general
>>>>>> interface that could support various different types of memory events
>>>>>> + notification?  And the two events listed below are just a subset of
>>>>>> the events that could / would be supported?
>>>>>>
>>>>>> In general, I like the sound of where this is going but I would like
>>>>>> to see support for notification of events such as when a domU reads /
>>>>>> writes / execs a pre-specified byte(s) of memory.  As such, there
>>>>>> would need to be a notification path (as discussed below) and also a
>>>>>> control path to set up the memory regions that the user app cares
>>>>>> about.
>>>>>>
>>>>>> -bryan
>>>>>>
>>>>>> On Wed, Jun 23, 2010 at 11:21 PM, Grzegorz Milos
>>>>>> <grzegorz.milos@xxxxxxxxx> wrote:
>>>>>>> [From Patrick]
>>>>>>>
>>>>>>> I think the idea of multiple rings is a good one. We'll register the
>>>>>>> clients in Xen and when a mem_event is reached, we can just iterate
>>>>>>> through the list of listeners to see who needs a notification.
>>>>>>>
>>>>>>> The person working on the anti-virus stuff is Bryan Payne from Georgia
>>>>>>> Tech. I've CCed him as well so we can get his input on this stuff as
>>>>>>> well. It's better to hash out a proper interface now rather than
>>>>>>> continually changing it around.
>>>>>>>
>>>>>>>
>>>>>>> Patrick
>>>>>>>
>>>>>>> On Wed, Jun 23, 2010 at 11:19 PM, Grzegorz Milos
>>>>>>> <grzegorz.milos@xxxxxxxxx> wrote:
>>>>>>>> [From Gregor]
>>>>>>>>
>>>>>>>> There are two major events that the memory sharing code needs to
>>>>>>>> communicate over the hypervisor/userspace boundary:
>>>>>>>> 1. GFN unsharing failed due to lack of memory. This will be called the
>>>>>>>> 'OOM event' from now on.
>>>>>>>> 2. MFN is no longer sharable (actually an opaque sharing handle would
>>>>>>>> be communicated instead of the MFN). 'Handle invalidate event' from
>>>>>>>> now on.
>>>>>>>>
>>>>>>>> The requirements on the OOM event are relatively similar to the
>>>>>>>> page-in event. The way this should operate is that the faulting VCPU
>>>>>>>> is paused, and the pager is requested to free up some memory. When it
>>>>>>>> does so, it should generate an appropriate response, and wake the
>>>>>>>> VCPU back up using a domctl. The event is going to be low volume,
>>>>>>>> and since it is going to be handled synchronously, likely in tens of
>>>>>>>> ms, there are no particular requirements on the efficiency.
>>>>>>>>
>>>>>>>> Handle invalidate event type is less important in the short term
>>>>>>>> because the userspace sharing daemon is designed to be resilient to
>>>>>>>> stale sharing state. However, if it is missing it will make the
>>>>>>>> sharing progressively less effective as time goes on. The idea is that
>>>>>>>> the hypervisor communicates which sharing handles are no longer valid,
>>>>>>>> such that the sharing daemon only attempts to share pages in the
>>>>>>>> correct state. This would be a relatively high-volume event, but it
>>>>>>>> doesn't need to be accurate (i.e. events can be dropped if they are
>>>>>>>> not consumed quickly enough). As such this event should be batch
>>>>>>>> delivered, in an asynchronous fashion.
>>>>>>>>
>>>>>>>> The OOM event is coded up in Xen, but it will not be consumed properly
>>>>>>>> in the pager. If I remember correctly, I didn't want to interfere with
>>>>>>>> the page-in events because the event interface assumed that mem-event
>>>>>>>> responses are inserted onto the ring in precisely the same order as
>>>>>>>> the requests. This may not be the case when we start mixing different
>>>>>>>> event types. WRT the handle invalidation, the relevant hooks exist
>>>>>>>> in Xen, and in the mem sharing daemon, but there is no way to
>>>>>>>> communicate events to two different consumers atm.
>>>>>>>>
>>>>>>>> Since the requirements on the two different sharing event types are
>>>>>>>> substantially different, I think it may be easier if separate channels
>>>>>>>> (i.e. separate rings) were used to transfer them. This would also fix
>>>>>>>> the multiple consumers issue relatively easily. Of course you may know
>>>>>>>> of some other mem events that wouldn't fit in that scheme.
>>>>>>>>
>>>>>>>> I remember that there was someone working on an external anti-virus
>>>>>>>> software, which prompted the whole mem-event work. I don't remember
>>>>>>>> his/her name or affiliation (could you remind me?), but maybe he/she
>>>>>>>> would be interested in working on some of this?
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Gregor
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
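
The user-space daemon idea from the quoted thread (clients register a
start address and size, Xen only tracks whole pages, and the daemon
decides who gets each notification) can be sketched roughly as below.
This is only an illustration under stated assumptions: MemEventDaemon,
Watch and every method name are hypothetical, addresses are
guest-physical byte addresses, and nothing here is the actual
mem_event or libxc interface.

```python
from collections import namedtuple

PAGE_SIZE = 4096

# One registered region of interest: [start, start + size) in bytes.
Watch = namedtuple("Watch", "client_id start size sync")


class MemEventDaemon:
    def __init__(self):
        self.watches = []
        self.next_id = 1

    def register(self, start, size=PAGE_SIZE, sync=False):
        """Register a byte range; returns a unique client ID."""
        client_id = self.next_id
        self.next_id += 1
        self.watches.append(Watch(client_id, start, size, sync))
        return client_id

    def register_page(self, pfn, sync=False):
        """Page-level helper: translate a pfn to start + PAGE_SIZE."""
        return self.register(pfn * PAGE_SIZE, PAGE_SIZE, sync)

    def pages_for_xen(self):
        """Collapse all watches to {pfn: needs_sync} for the hypervisor.

        A page is synchronous if any client watching it asked for sync.
        """
        pages = {}
        for w in self.watches:
            first = w.start // PAGE_SIZE
            last = (w.start + w.size - 1) // PAGE_SIZE
            for pfn in range(first, last + 1):
                pages[pfn] = pages.get(pfn, False) or w.sync
        return pages

    def dispatch(self, pfn, offset, length=1):
        """Return the IDs of clients whose range overlaps the access."""
        lo = pfn * PAGE_SIZE + offset
        hi = lo + length
        return sorted(w.client_id for w in self.watches
                      if w.start < hi and lo < w.start + w.size)
```

For example, if one client watches all of page 5 and another watches
a single byte at offset 16 of that page with sync=True,
pages_for_xen() reports only page 5 (marked synchronous), and
dispatch(5, 16) returns both client IDs while dispatch(5, 17) returns
only the first.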

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel