This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] Re: [PATCH] xen: reduce severity of message about using

To: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Subject: Re: [Xen-devel] Re: [PATCH] xen: reduce severity of message about using v1 grant tables.
From: Ian Campbell <Ian.Campbell@xxxxxxxxxx>
Date: Thu, 3 Dec 2009 10:28:44 +0000
Cc: Steven Smith <Steven.Smith@xxxxxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Thu, 03 Dec 2009 02:29:05 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <1259783682.31045.22.camel@xxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Organization: Citrix Systems, Inc.
References: <1259782098-32180-1-git-send-email-Ian.Campbell@xxxxxxxxxx> <4B16C103.9070105@xxxxxxxx> <1259783682.31045.22.camel@xxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
On Wed, 2009-12-02 at 19:54 +0000, Ian Campbell wrote:
> > Does it really need to be a panic?  Can't we just start failing all
> > future operations?  Seems bad to take out the whole machine if we can
> > just get away with crippling one device (especially if it can be
> > recovered by downing it and re-upping a new one with nc1 and/or gt1).
> Wouldn't there be (failing) grant table ops on the down path?
> In any case doesn't it affect all devices since they all use the same
> grant table? 

Oh, I see what you meant... in the proper resume case (as opposed to the
cancelled suspend/checkpoint case I was thinking of) there should be no
grant tables in use at this point, so most devices should, in theory, be
able to reconnect using v1 grants. Any drivers which require v2 grant
tables need to check for them in their resume hook as well as at start
of day.

Unfortunately frontend devices tear down their grant entries after the
resume rather than before the suspend (I presume this has to do with
faster checkpointing?) which means they could be trying to clear an
entry of the wrong layout, leading to the unbounded badness that the
comment refers to.

I think the choices are basically:
      * Always latch to either v1 or v2 at start of day; if we can't get
        the version we want then panic (this is a stronger restriction
        than the current code which will try to upgrade to v2 on resume)
      * Write v1<->v2 layout transformations called on gnttab resume
        before the devices get a chance to try and unmap their old
        entries. Would need to handle v2 entries using features which are
        not expressible in v1.
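
For illustration only, the v2->v1 direction of the second option might look
roughly like the sketch below. The entry layouts, the flag values and all of
the ex_ names are made up for this example and are not Xen's real
definitions. The only point is that the conversion has to be able to fail for
entries which have no v1 equivalent, and the caller then needs some policy
for those.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified, hypothetical entry layouts. Field names and flag values are
 * assumptions for illustration, not copied from Xen's public headers.
 */
#define EX_FLAG_SUB_PAGE   (1u << 8)  /* assumed v2-only feature */
#define EX_FLAG_TRANSITIVE (1u << 9)  /* assumed v2-only feature */
#define EX_V2_ONLY_FLAGS   (EX_FLAG_SUB_PAGE | EX_FLAG_TRANSITIVE)

struct ex_entry_v1 {
    uint16_t flags;
    uint16_t domid;
    uint32_t frame;   /* narrower frame field in this model */
};

struct ex_entry_v2 {
    uint16_t flags;
    uint16_t domid;
    uint64_t frame;
};

/*
 * Convert one v2 entry to the v1 layout.
 * Returns false, leaving *out untouched, if the entry uses a feature or a
 * frame number that this model's v1 layout cannot express.
 */
bool ex_v2_to_v1(const struct ex_entry_v2 *in, struct ex_entry_v1 *out)
{
    if (in->flags & EX_V2_ONLY_FLAGS)
        return false;
    if (in->frame > UINT32_MAX)
        return false;

    out->flags = in->flags;
    out->domid = in->domid;
    out->frame = (uint32_t)in->frame;
    return true;
}

/* The v1 -> v2 direction is always expressible in this model. */
void ex_v1_to_v2(const struct ex_entry_v1 *in, struct ex_entry_v2 *out)
{
    out->flags = in->flags;
    out->domid = in->domid;
    out->frame = in->frame;
}
```

What to do when ex_v2_to_v1() returns false is exactly the awkward part of
this option, and the sketch deliberately leaves it to the caller.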

I'm tempted to go with the former for simplicity. It enables migration
to a newer version of Xen (the guest will just keep using v1) but will
not allow migration back to an older version of Xen, which is not
something we generally support anyway.
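
As a rough sketch of what "latch at start of day" would mean, assuming a
hypothetical helper which is called once at boot and again on every resume
(none of these names exist in the kernel, and the real version negotiation
with the hypervisor is reduced here to a single integer):

```c
#include <stdbool.h>

/* Hypothetical model of the policy; not the kernel's gnttab code. */
enum ex_gnttab_version {
    EX_GNTTAB_UNSET = 0,
    EX_GNTTAB_V1    = 1,
    EX_GNTTAB_V2    = 2,
};

struct ex_gnttab_state {
    enum ex_gnttab_version latched;
};

/*
 * Called at start of day and on every resume.
 *
 * hv_max_version is the highest grant table layout the hypervisor we are
 * currently running on can provide (assumed to be 1 or 2).
 *
 * Returns true if the guest can carry on with its latched version, false
 * if the latched version is no longer available and the caller would have
 * to panic.
 */
bool ex_gnttab_select(struct ex_gnttab_state *st, int hv_max_version)
{
    if (st->latched == EX_GNTTAB_UNSET) {
        /* Start of day: pick the best available version and latch it. */
        st->latched = (hv_max_version >= 2) ? EX_GNTTAB_V2 : EX_GNTTAB_V1;
        return true;
    }

    /* Resume: never upgrade or downgrade, only check we can keep going. */
    return (int)st->latched <= hv_max_version;
}
```

With this model a guest which booted on a v1-only hypervisor stays on v1
after migrating to a v2-capable one, while a guest which latched v2 gets
false back (i.e. would panic) when resumed on a v1-only hypervisor.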

