Re: [Xen-devel] Re: Fine-grained proxy resource charging

To: Lucy Cherkasova <lucy@xxxxxxxxxxxxxxxx>
Subject: Re: [Xen-devel] Re: Fine-grained proxy resource charging
From: John L Griffin <jlg@xxxxxxxxxx>
Date: Tue, 23 Aug 2005 16:36:37 -0400
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Wed, 24 Aug 2005 09:05:11 +0000
Howdy Lucy, and thanks for the detailed reply.

> I/O bandwidth is a natural extension for resource accounting and can
> be addressed as well.

Agreed.  An interesting extension of this example: what if a non-I/O 
privileged service domain B (say, a domain that provides an object 
encryption service) must sometimes perform some network I/O through a 
third domain (say, downloading a new encryption algorithm from an external 
source) in order to continue servicing A's requests?  In this case, it 
will need to be possible for B to transitively specify on whose behalf the 
third-party work is being performed.
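
A minimal sketch of the transitive case, assuming a hypothetical accounting ledger (none of these names are Xen interfaces): a nested request issued by B inherits the originating domain of the request that triggered it, so the third domain's work is charged to A rather than to B.

```python
from collections import defaultdict


class Ledger:
    """Hypothetical accounting ledger; illustrative only, not a Xen API."""

    def __init__(self):
        self.charges = defaultdict(float)  # originating domain -> total cost
        self.on_behalf_of = {}             # request id -> originating domain

    def start(self, request_id, client, parent_request=None):
        # A nested request inherits its originator from the parent request,
        # so the charge follows the chain back to the original client.
        if parent_request is not None:
            client = self.on_behalf_of[parent_request]
        self.on_behalf_of[request_id] = client

    def charge(self, request_id, cost):
        self.charges[self.on_behalf_of[request_id]] += cost


ledger = Ledger()

# A asks B to encrypt an object.
ledger.start("encrypt-1", client="A")
ledger.charge("encrypt-1", 5.0)

# B must fetch a new algorithm through a third domain to finish A's request.
ledger.start("download-1", client="B", parent_request="encrypt-1")
ledger.charge("download-1", 2.0)
```

Here both the encryption work and the download end up on A's account; the cost units are arbitrary.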

> We believe that the amount of memory page exchanges (between A and B,
> C and B, etc.) is a relatively accurate "hint" for splitting the CPU
> overhead in B with respect to A, C, etc.

I do like the simplicity of this approach -- in fact, I had exactly the 
same thought about using grant table operations to approximate proxy CPU 
consumption.  (Imagine my surprise when I opened up ;login: magazine and 
found the summary of the USENIX talks...)
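
As a rough illustration of the page-exchange "hint" (a sketch under assumed inputs, not the measurement code from Lucy's work), B's CPU time for an interval could be split in proportion to the number of page exchanges each client performed with B. The function name, the units, and the counts below are made up for the example.

```python
def split_cpu_by_page_exchanges(b_cpu_ms, exchanges):
    """Split B's CPU time among clients in proportion to page exchanges.

    b_cpu_ms  -- CPU time B consumed during the interval (assumed unit: ms)
    exchanges -- mapping of client domain -> page exchanges with B
    """
    total = sum(exchanges.values())
    if total == 0:
        # No exchanges observed, so nothing can be attributed by this hint.
        return {dom: 0.0 for dom in exchanges}
    return {dom: b_cpu_ms * n / total for dom, n in exchanges.items()}


# Hypothetical interval: B used 200 ms; A exchanged 300 pages, C exchanged 100.
shares = split_cpu_by_page_exchanges(200.0, {"A": 300, "C": 100})
```

Note that this charges 100% of B's CPU time to its clients, which is exactly the assumption questioned further down.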

However, I'd already rejected the idea, and decided to concentrate on 
alternate/finer-grained/internal-to-B approaches, for several reasons:

- Preventing DoS.  The count-page-exchanges scheme is good for CPU 
accounting, especially since it can be done with an unmodified B. However, 
in the context of actively rate-limiting A's resource consumption (perhaps 
by delaying or failing the page exchanges) it wouldn't work, since by the 
time B invokes a grant table operation it's too late -- B has already been 

- Non-correlation of grant table operations to resource consumption.  I 
don't have any empirical evidence for this, but I envision "service 
domain" scenarios like the one above, where there may be very few page 
exchanges and yet wildly different amounts of work performed. 

- Overcharging (as in my previous message).  I don't have any evidence for 
this either, but my first thought was that it might be an overly pessimistic 
assumption to allocate 100% of B's resource usage to its clients.

- Transitivity (as in my example above).

Regardless, I agree with Rob's earlier point -- even a coarse-grained 
solution is better than nothing.  Perhaps the most interesting result from 
your work is just how substantial B's proxy CPU usage can be (and its 
effect on provisioning a system), and how we/the Xen community should 
perhaps focus on making that more efficient.

> The problem gets
> much harder and more complex when there are different drivers hosted
> by the same driver domain.

Could anybody comment on the current status of breaking dom0 into multiple 
single-function service domains, and/or not having a driver domain hosting 
multiple drivers?  I'm not caught up on Xen's current events, pun 

> Yes, we are also looking at how this overhead can be taken into
> account during the CPU scheduling for making a smarter resource
> allocation decision. The trade off here seems to be how one can
> enforce such decisions: either via a new scheduling policy (requires
> changes to Xen) or via changing the next period resource allocation
> from the outside of Xen based on the previous usage (one can use xm
> bvt .... or xm sedf facility for changing the allocation). It might
> depend on the targeted granularity of resource allocation decisions.

Actually, my thought was about the scheduling that happens inside the domain 
(such as B's selection of which network packet it will process next).  I'd 
started to brainstorm about doing different scheduling at the 
Xen-scheduling level, but couldn't come up with any ideas that didn't have 
the potential of adversely affecting non-A domains.  Maybe it'd be 
possible to split a driver domain into a group of cooperating 
mini-domains, that collectively accomplish the same purpose but are 
independently scheduled?  Each mini-domain would service exactly one 
A-type domain.  (How bad are context switches in ring 1...maybe not too 
bad, so the main challenge would be architecting the mini-domains.)  I'd 
be happy to join in (from a distance) on any continuing whiteboard 
discussions you have along these lines.
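
To make the in-domain scheduling idea concrete, here is a small sketch, assuming B keeps a per-client usage counter and always services the pending packet of the client it has charged least so far. The class, the method names, and the unit cost per packet are all hypothetical; a real driver domain would measure actual cost.

```python
from collections import deque


class UsageAwareQueue:
    """Hypothetical packet selector inside B; illustrative only."""

    def __init__(self):
        self.pending = {}  # client domain -> queue of pending packets
        self.usage = {}    # client domain -> cost charged so far

    def enqueue(self, client, packet):
        self.pending.setdefault(client, deque()).append(packet)
        self.usage.setdefault(client, 0.0)

    def next_packet(self):
        # Among clients with work waiting, pick the least-charged one.
        ready = [c for c, q in self.pending.items() if q]
        if not ready:
            return None
        client = min(ready, key=lambda c: self.usage[c])
        return client, self.pending[client].popleft()

    def record_cost(self, client, cost):
        self.usage[client] += cost


q = UsageAwareQueue()
for i in range(3):
    q.enqueue("A", f"a{i}")  # A floods B with packets
q.enqueue("C", "c0")         # C sends a single packet

order = []
while True:
    item = q.next_packet()
    if item is None:
        break
    client, packet = item
    q.record_cost(client, 1.0)  # assumed unit cost per packet
    order.append(packet)
```

With this policy C's single packet is serviced right after A's first one instead of waiting behind A's whole burst, and no change to the Xen scheduler is needed.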

> The other interesting question here is to provide some kind of
> performance isolation: for example, limiting the impact of the
> excessive traffic to one domain (say A) and its related overhead in
> driver domain (B)  on performance of the other domains.

Agreed!  This is exactly along the lines of what I was thinking about 
having Xen expose A's resource usage counts to B -- with the hope of 
allowing B (by scheduling or some other mechanism) to cut A off for 

