xen-devel

RE: [Xen-devel] Critical bug: VT-d fault causes disk corruption or Dom0

To: Keir Fraser <keir.fraser@xxxxxxxxxxxxx>, "Kay, Allen M" <allen.m.kay@xxxxxxxxx>, "Li, Xin" <xin.li@xxxxxxxxx>, "Li, Haicheng" <haicheng.li@xxxxxxxxx>, "'xen-devel@xxxxxxxxxxxxxxxxxxx'" <xen-devel@xxxxxxxxxxxxxxxxxxx>, "Wang, Shane" <shane.wang@xxxxxxxxx>
Subject: RE: [Xen-devel] Critical bug: VT-d fault causes disk corruption or Dom0 kernel panic.
From: "Cihula, Joseph" <joseph.cihula@xxxxxxxxx>
Date: Fri, 23 Jan 2009 18:19:00 -0800
Accept-language: en-US
Cc:
Delivery-date: Fri, 23 Jan 2009 18:19:37 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <C59FBFF6.21CC8%keir.fraser@xxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <57C9024A16AD2D4C97DC78E552063EA35EED26A5@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <C59FBFF6.21CC8%keir.fraser@xxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: Acl8Q3YVs3niaMuxT1CPWgfSc0ZGCwAITqg5AAKyJXAAAOMrHgAfiHegABEJqTgAEqyPIAACkS3lAA9fDjA=
Thread-topic: [Xen-devel] Critical bug: VT-d fault causes disk corruption or Dom0 kernel panic.
> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx 
> [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On
> Behalf Of Keir Fraser
> Sent: Friday, January 23, 2009 10:42 AM
>
> Ah, I know what it is! We actually free up bits of the Xen image at the end
> of Xen bootstrap, and these can now be allocated to a domain (e.g., dom0)
> and DMAed to. But these will be contained within the bounds of __pa(&_start)
> and __pa(&_end) and hence will not have been mapped in dom0's vtd tables.
>
> Sadly the fact is that Xen relies on validity of memory from the domain heap
> as well as Xen heap anyway, so the avoidance of mapping Xen-critical memory
> in dom0 vtd tables is inadequate anyway, even on x86_32 and ia64.
>
> Also it's going to be hard to do better while keeping efficiency since if
> you only map dom0's pages in its vtd tables then PV backend drivers will not
> work (which rely on DMAing to/from other domain's pages via grant
> references). You'd have to dynamically map/unmap as grants get
> mapped/unmapped, and you may not want the performance hit of that.
>
> I'd personally vote for getting rid of xen_in_range(). Alternatively we
> could have it merely check for is_kernel_text(), but really I think since it
> is not in any way full protection from dom0 I wonder if it is worth the
> bother at all.
>
> What do you think?
>
>  -- Keir

Since this is somewhat similar to the issue I'm facing with the TXT patch, it 
does seem useful to have a good way of knowing where all of the hypervisor 
memory is.

I looked at is_kernel_text(), and it only compares against _stext/_etext,
which, judging by the xen.lds file, covers only part of the hypervisor's
code.  Is there any reason not to use [_stext, __init_begin) +
[__per_cpu_start, __per_cpu_end] + [__bss_start, _end] +
[bootsym_phys(trampoline_start), bootsym_phys(trampoline_end)] as a first
approximation of hypervisor memory (I'm assuming that the code within
[__init_begin, __init_end] is what you reclaim)?

While this still doesn't get the xen heap or domain heap, it at least gets us a 
little farther.
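
As a rough illustration of the range union described above (this is not code
from the Xen tree), a membership check over a table of ranges might look like
the sketch below. The names phys_range, hyp_ranges and hyp_in_range are
hypothetical, and the addresses are placeholders standing in for whatever the
physical addresses of the linker symbols would be in a real build. All ranges
are treated as half-open here, which is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open physical address range [start, end). */
struct phys_range {
    uint64_t start;
    uint64_t end;
};

/*
 * Placeholder addresses only (hypothetical). In the hypervisor these
 * would be derived from the linker symbols named in the comments.
 * The hole between the first and second entries stands in for
 * [__init_begin, __init_end), which is assumed to be reclaimed.
 */
static const struct phys_range hyp_ranges[] = {
    { 0x0008c000, 0x00090000 }, /* trampoline_start .. trampoline_end */
    { 0x00100000, 0x00180000 }, /* _stext .. __init_begin */
    { 0x001a0000, 0x001b0000 }, /* __per_cpu_start .. __per_cpu_end */
    { 0x001b0000, 0x00200000 }, /* __bss_start .. _end */
};

/* Return true if paddr lies inside the single range r. */
static bool range_contains(const struct phys_range *r, uint64_t paddr)
{
    assert(r->start <= r->end);
    return paddr >= r->start && paddr < r->end;
}

/* Return true if paddr lies inside any of the listed ranges. */
bool hyp_in_range(uint64_t paddr)
{
    size_t i;

    for (i = 0; i < sizeof(hyp_ranges) / sizeof(hyp_ranges[0]); i++) {
        if (range_contains(&hyp_ranges[i], paddr))
            return true;
    }
    return false;
}
```

With a table like this, an address in the reclaimed init hole is reported as
not belonging to the hypervisor, so it would still be mapped for dom0.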

For the MAC aspect of the TXT patch, we need to know all of the code + data 
that could be used during resume and before the xen code that MACs everything 
else.  This includes the stack, page tables, etc.

We've also added a function that checks the ACPI Sx addresses against xen
memory (hypervisor + domain) to ensure that tboot can't be tricked into
overwriting xen as part of S3.  This needs to be a more comprehensive check
than for MAC, since there is no way of detecting if we missed some range.
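
To make the idea of such a check concrete, here is a minimal sketch (not the
function from the TXT patch) that rejects a write target if it overlaps any
protected range. The names mem_range, example_protected and target_is_safe,
and all addresses, are hypothetical and chosen only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open memory range [start, end). */
struct mem_range {
    uint64_t start;
    uint64_t end;
};

/* Hypothetical protected regions (e.g. hypervisor image, domain memory). */
static const struct mem_range example_protected[] = {
    { 0x00100000, 0x00200000 },
    { 0x40000000, 0x48000000 },
};

/* Two half-open ranges overlap iff each starts before the other ends. */
static bool ranges_overlap(struct mem_range a, struct mem_range b)
{
    return a.start < b.end && b.start < a.end;
}

/*
 * Return true only if [addr, addr + len) touches none of the n protected
 * ranges. Zero-length and wrapping targets are rejected conservatively.
 */
bool target_is_safe(uint64_t addr, uint64_t len,
                    const struct mem_range *prot, size_t n)
{
    struct mem_range target;
    size_t i;

    if (len == 0 || addr > UINT64_MAX - len)
        return false;

    target.start = addr;
    target.end = addr + len;

    for (i = 0; i < n; i++) {
        if (ranges_overlap(target, prot[i]))
            return false;
    }
    return true;
}
```

A check of this shape is only as good as the list of protected ranges passed
in, which is why a missed range goes undetected.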

Joe

>
> On 23/01/2009 17:30, "Kay, Allen M" <allen.m.kay@xxxxxxxxx> wrote:
>
> > I have not figured out why this is the problem yet, but I know commenting
> > it out makes the problem go away.  Leaving tboot_in_range() in does not
> > cause this problem.
> >
> > Allen
> >
> > -----Original Message-----
> > From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> > Sent: Friday, January 23, 2009 12:34 AM
> > To: Kay, Allen M; Li, Xin; Li, Haicheng; 'xen-devel@xxxxxxxxxxxxxxxxxxx'
> > Subject: Re: [Xen-devel] Critical bug: VT-d fault causes disk corruption or
> > Dom0 kernel panic.
> >
> > Are you sure that is the problem? The xen_in_range() change should make the
> > dom0 VT-d table more permissive, and hence if anything less likely to
> > experience VT-d faults. Also it wouldn't seem to explain problems for HVM
> > guest passthrough.
> >
> >  -- Keir
> >
> > On 23/01/2009 01:01, "Kay, Allen M" <allen.m.kay@xxxxxxxxx> wrote:
> >
> >> Looks like the problem is caused by xen_in_range() call in
> >> vtd/iommu.c/intel_iommu_domain_init().  Definition of xen_in_range() was
> >> changed as part of the heap patch.
> >>
> >> I'm looking into changing intel_iommu_domain_init() to just map pages in
> >> dom0->page_list.  However, this looks to be more complicated, as
> >> d->page_list is not initialized at this stage of the boot yet.
> >>
> >> Allen
> >>
> >> -----Original Message-----
> >> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
> >> [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Keir Fraser
> >> Sent: Thursday, January 22, 2009 1:23 AM
> >> To: Li, Xin; Li, Haicheng; 'xen-devel@xxxxxxxxxxxxxxxxxxx'
> >> Subject: Re: [Xen-devel] Critical bug: VT-d fault causes disk corruption or
> >> Dom0 kernel panic.
> >>
> >> Mmm well not really. :-)
> >>
> >> Is there any assumption in the VT-d setup about preventing access to the
> >> Xen heap, and could that be broken?
> >>
> >> Perhaps the VT-d pagetables are broken causing bad DMAs leading to data
> >> corruption and bad command packets?
> >>
> >>  -- Keir
> >>
> >> On 22/01/2009 08:58, "Li, Xin" <xin.li@xxxxxxxxx> wrote:
> >>
> >>> We are looking into the issue too. If you have any idea on how it's
> >>> caused, please tell us :-)
> >>> Thanks!
> >>> -Xin
> >>>
> >>>> -----Original Message-----
> >>>> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
> >>>> [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Keir Fraser
> >>>> Sent: Thursday, January 22, 2009 3:40 PM
> >>>> To: Li, Haicheng; 'xen-devel@xxxxxxxxxxxxxxxxxxx'
> >>>> Subject: Re: [Xen-devel] Critical bug: VT-d fault causes disk corruption
> >>>> or Dom0 kernel panic.
> >>>>
> >>>> Thanks,
> >>>>
> >>>> I haven't seen any problems outside of VT-d since c/s 19057, btw.
> >>>>
> >>>> -- Keir
> >>>>
> >>>> On 22/01/2009 03:42, "Li, Haicheng" <haicheng.li@xxxxxxxxx> wrote:
> >>>>
> >>>>> All,
> >>>>>
> >>>>> We met several system failures on different hardware platforms, all of
> >>>>> which are caused by VT-d faults.
> >>>>> err 1: disk is corrupted by a VT-d fault on SATA.
> >>>>> err 2: Dom0 kernel panics at boot, caused by a VT-d fault on UHCI.
> >>>>> err 3: Dom0 complains of disk errors while creating HVM guests.
> >>>>>
> >>>>> The culprit would be changeset 19054 "x86_64: Remove
> >>>>> statically-partitioned Xen heap.".
> >>>>>
> >>>>> Detailed error logs can be found via BZ#,
> >>>>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1409.
> >>>>>
> >>>>>
> >>>>> -haicheng
> >>>>> _______________________________________________
> >>>>> Xen-devel mailing list
> >>>>> Xen-devel@xxxxxxxxxxxxxxxxxxx
> >>>>> http://lists.xensource.com/xen-devel
> >>>>
> >>>>
> >>>>
> >>
> >>
> >>
> >
> >
>
>
>
