xen-devel

Re: [Xen-devel] x86_32: spurious page faults in guest GDT area

To: Jan Beulich <jbeulich@xxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: Re: [Xen-devel] x86_32: spurious page faults in guest GDT area
From: Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Date: Mon, 16 Jun 2008 11:41:27 +0100
Delivery-date: Mon, 16 Jun 2008 03:42:10 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <48565D40.76E4.0078.0@xxxxxxxxxx>
What's the #PF error code -- is it a not-present or an access-violation
fault; read/write access; etc?
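
For reference, the distinctions asked about above map directly onto the low bits of the x86 #PF error code. The following is a minimal sketch of decoding them; the bit positions are architectural, but the macro, struct, and function names are illustrative assumptions and are not taken from the Xen source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Architectural x86 #PF error code bits (macro names are illustrative). */
#define PF_EC_PRESENT (1u << 0) /* 0: page not present, 1: access violation */
#define PF_EC_WRITE   (1u << 1) /* 0: read access,      1: write access     */
#define PF_EC_USER    (1u << 2) /* 0: supervisor (CPL < 3), 1: user (CPL 3)  */
#define PF_EC_RSVD    (1u << 3) /* reserved bit set in a paging structure   */
#define PF_EC_IFETCH  (1u << 4) /* fault was caused by an instruction fetch */

/* Hypothetical helper type: one flag per decoded error code bit. */
struct pf_info {
    bool protection;   /* true: access violation, false: not-present */
    bool write;
    bool user;
    bool reserved_bit;
    bool insn_fetch;
};

/* Split a raw #PF error code into its individual flags. */
static struct pf_info decode_pf_error_code(unsigned int ec)
{
    struct pf_info pf;

    pf.protection   = (ec & PF_EC_PRESENT) != 0;
    pf.write        = (ec & PF_EC_WRITE) != 0;
    pf.user         = (ec & PF_EC_USER) != 0;
    pf.reserved_bit = (ec & PF_EC_RSVD) != 0;
    pf.insn_fetch   = (ec & PF_EC_IFETCH) != 0;
    return pf;
}

/* Render the decoded flags as a short human-readable line for logging. */
static int format_pf_info(const struct pf_info *pf, char *buf, size_t len)
{
    assert(pf != NULL && buf != NULL && len > 0);
    return snprintf(buf, len, "%s, %s access, %s mode%s%s",
                    pf->protection ? "access violation" : "not present",
                    pf->write ? "write" : "read",
                    pf->user ? "user" : "supervisor",
                    pf->reserved_bit ? ", reserved bit set" : "",
                    pf->insn_fetch ? ", instruction fetch" : "");
}
```

A descriptor load from the GDT is a supervisor-level read by the CPU, so one would expect such a fault to report a read from supervisor mode; whether it is reported as not-present or as an access violation is exactly the open question here.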

Do these faults happen under stable workload (by which I mean no domains
being created/destroyed -- all VMs are booted and just running normal kinds
of stuff)?

 -- Keir

On 16/6/08 11:32, "Jan Beulich" <jbeulich@xxxxxxxxxx> wrote:

> Under prolonged stress I can reproduce this issue back to at least
> c/s 16084; in older changesets it was apparently so rare that during
> normal work/testing I either never noticed it or had to ignore it because
> it was not reproducible. However, on recent changesets (tested with our
> 2.6.25-based kernels only so far) it happens much more frequently (and
> occasionally even while the machine boots).
> 
> I inserted selector validation code in the context switch path to verify
> that a vcpu's selectors are okay (or rather, that the guest-provided
> part of the GDT is accessible). So far these checks have never indicated
> a failure.
> 
> The faults may happen in various places (hypervisor exit path as well
> as guest code), and always involve loading a selector register with a
> guest defined value (i.e. in the first page of the GDT). A page walk
> in the (hypervisor) fault handler shows that all levels of the translation
> exist (and are valid/consistent), and instrumentation of the selector
> manipulation functions shows that none of them get called spuriously.
> 
> Hence I can only suspect some asynchronous page table manipulation
> (but I'm not aware of anything like that) lacking proper TLB flushing, or
> some very rare issue with the CR3 reloading code.
> 
> The same 32-bit kernel used with a 64-bit hypervisor has so far not
> shown similar problems. While I first thought this would help narrow
> the problem down, I'm pretty clueless at this point because none of the
> candidate areas where 32-bit code differs from 64-bit look troublesome
> to me (most notably, TLB flushing is identical between the two).
> 
> Any ideas on how to narrow down the problem would be appreciated.
> Thanks, Jan
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
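
The quoted report says the faulting selector loads always reference the first page of the GDT. As a rough illustration of what that means at the bit level, here is a minimal sketch that splits a selector into its architectural fields and checks whether its descriptor lies within the first 4 KiB of the GDT. The field layout (RPL in bits 0-1, table indicator in bit 2, index in bits 3-15, 8-byte descriptors) is architectural; the macro and function names are hypothetical, and nothing here reflects how Xen itself lays out or validates the guest GDT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural selector layout (names are illustrative). */
#define SEL_RPL_MASK 0x3u  /* bits 0-1: requested privilege level */
#define SEL_TI_LDT   0x4u  /* bit 2: 0 = GDT, 1 = LDT             */
#define DESC_SIZE    8u    /* bytes per descriptor table entry    */
#define PAGE_SIZE_4K 4096u

/* A 4 KiB page holds a whole number of descriptors (512 of them). */
static_assert(PAGE_SIZE_4K % DESC_SIZE == 0, "descriptors must tile a page");

/* Requested privilege level encoded in the selector. */
static unsigned int sel_rpl(uint16_t sel)
{
    return sel & SEL_RPL_MASK;
}

/* True if the selector refers to the LDT rather than the GDT. */
static bool sel_is_ldt(uint16_t sel)
{
    return (sel & SEL_TI_LDT) != 0;
}

/* Index of the descriptor within its table. */
static unsigned int sel_index(uint16_t sel)
{
    return sel >> 3;
}

/* Byte offset of the descriptor from the table base. */
static unsigned int sel_table_offset(uint16_t sel)
{
    return sel_index(sel) * DESC_SIZE;
}

/*
 * True if the selector names a GDT descriptor that lies entirely within
 * the first 4 KiB page of the table (indices 0..511). Index 0 is the
 * null descriptor and also counts as being in the first page here.
 */
static bool sel_in_first_gdt_page(uint16_t sel)
{
    return !sel_is_ldt(sel) &&
           sel_table_offset(sel) + DESC_SIZE <= PAGE_SIZE_4K;
}
```

For example, selector 0x0FF9 decodes to index 511, GDT, RPL 1, which is the last descriptor in the first page, while 0x1000 (index 512) already falls in the second page.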
