xen-devel

Re: [Xen-devel] frontend and backend devices and different types of hw - pci for example

To: Mark Williamson <mark.williamson@xxxxxxxxxxxx>
Subject: Re: [Xen-devel] frontend and backend devices and different types of hw - pci for example
From: Stefan Berger <stefanb@xxxxxxxxxx>
Date: Tue, 6 Sep 2005 17:59:41 -0400
Cc: Sting Zax <zstingx@xxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx, xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
In-reply-to: <200509040423.51526.mark.williamson@xxxxxxxxxxxx>
Mark Williamson <mark.williamson@xxxxxxxxxxxx> wrote on 09/03/2005 11:23:50 PM:

> > > Possibly...  I would have been inclined to do it using some sort of
> > > interdomain communication rather than using an emulator in Xen but I'm
> > > actually open to persuasion that I'm wrong on this point ;-)
> >
> > I thought about interdomain communication for emulating PCI devices. It
> > 'feels' like this would rip apart both domains' PCI layers quite a bit.
> 
> Well...  I think you could hook it into the PCI layer reasonably cleanly by
> using function pointers to ensure that one set of arch functions gets called
> for dom0 operation and another set gets called for driver domains.
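That hook point might look something like a raw config-access ops split
(just a sketch, loosely modelled on the i386 PCI code; all names here are
made up):

    #include <stdint.h>

    /* One implementation of these ops talks to the hardware directly
     * (dom0), the other forwards the request over an interdomain channel
     * (driver domain).  Illustrative sketch only, not actual kernel code. */
    struct pci_conf_ops {
        int (*read) (unsigned int bus, unsigned int devfn,
                     int reg, int len, uint32_t *value);
        int (*write)(unsigned int bus, unsigned int devfn,
                     int reg, int len, uint32_t value);
    };

    static struct pci_conf_ops pci_direct_ops;   /* dom0: real 0xcf8/0xcfc  */
    static struct pci_conf_ops pci_forward_ops;  /* driver domain: proxied  */

    /* Selected once at boot; the rest of the PCI layer never needs to care
     * which variant it is calling. */
    static struct pci_conf_ops *pci_conf = &pci_direct_ops;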

All inb/w/l and outb/w/l accesses to the PCI config ports 0xcf8-0xcff could
also be proxied in domain 0, and domain 0 could make visible to a driver
domain only those devices that are meant to be used by that domain. I think
that doing this in the hypervisor, through IO port or (partial) PCI bus
emulation, is a "cleaner" way, though.

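For illustration, the decoding such a proxy would have to do for config
mechanism #1 - a sketch only, where device_assigned_to_domain() and
real_config_read() are made-up placeholders for whatever dom0 (or Xen)
actually provides:

    #include <stdint.h>

    static uint32_t config_address;           /* last value written to 0xcf8 */

    /* Split the CONFIG_ADDRESS value written at 0xcf8 into its fields. */
    static void decode_cf8(uint32_t addr, unsigned *bus, unsigned *dev,
                           unsigned *fn, unsigned *reg)
    {
        *bus = (addr >> 16) & 0xff;
        *dev = (addr >> 11) & 0x1f;
        *fn  = (addr >> 8)  & 0x07;
        *reg = addr & 0xfc;                    /* dword-aligned register */
    }

    /* Guest write to 0xcf8: just remember the address. */
    void proxy_outl_cf8(uint32_t val)
    {
        config_address = val;
    }

    /* Guest read from 0xcfc: only devices assigned to this domain are
     * visible; everything else reads as all-ones ("no device here"). */
    uint32_t proxy_inl_cfc(void)
    {
        unsigned bus, dev, fn, reg;

        if (!(config_address & 0x80000000))    /* enable bit not set */
            return 0xffffffff;

        decode_cf8(config_address, &bus, &dev, &fn, &reg);

        if (!device_assigned_to_domain(bus, dev, fn))
            return 0xffffffff;

        return real_config_read(bus, dev, fn, reg);
    }
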
> 
> Bear in mind we don't necessarily need to interpose the PCI virtualisation
> code on accesses to the device itself, just those to the PCI bridges. Access
> to the device itself can be made directly (although see my comment below).
>
> > through translation in other parts where APIC or ACPI-related code plays a
> > role. I haven't had the time to find out what part of the system is doing
> > that. Another issue is that the code in /xen/.../io_apic.c is not
> > activated at all in a user/driver domain.
> 
> That sounds like the sort of thing I was expecting but I can't tell you where
> to look off the top of my head...

Actually the IRQ translation happens through a function call like
acpi_register_gsi(), which in turn is called from pci_enable_device(). I'm
not sure whether that call can work properly anywhere other than in domain 0.
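From memory (so the exact function names may be off for a particular kernel
version), the path looks roughly like this:

    /* Rough 2.6-era call chain, for illustration only:
     *
     *   pci_enable_device(dev)
     *     -> pcibios_enable_device(dev, mask)
     *       -> acpi_pci_irq_enable(dev)        looks up the _PRT routing entry
     *         -> acpi_register_gsi(gsi, ...)   translates the GSI to a kernel
     *            IRQ, and the result ends up in dev->irq
     *
     * The translation needs the ACPI tables and the IO-APIC setup, which
     * only domain 0 sees, so a driver domain would have to obtain the final
     * IRQ number from domain 0 (or Xen) instead of computing it locally. */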

> 
> > I believe the architecture for access to PCI was different in Xen 1.0
> > where PCI was in the HV. With 2.0 this changed and PCI access was moved to
> > domain 0. Did driver domains work with that? If yes, then what was done in
> > domain 0 to prepare a domain's access to a device?
> 
> 1.x: PCI core code and device drivers in Xen, only virtual devices in guest
> 2.x: PCI core code in Xen, device drivers in domains
> 3.x: PCI core code in dom0, no driver domains yet...
> 
> In the 1.x architecture there just weren't any driver domains, Xen had all
> the drivers (except the graphics driver, which lived in the X server as
> usual).  In 2.x the PCI core was in Xen, with guests' PCI layer making
> hypercalls instead of doing direct hardware access.  Giving a guest a device
> consisted of: allowing it to see *just* that device in config space
> (reported in the hypercall interface), modifying its IO bitmap and memory
> mapping privileges so that it could access these things.
> 
> Since all this is yanked out of Xen now, we don't have quite such convenient
> mechanisms for hiding things from the guest; hence config space access needs
> to be able to go through another channel, or (as you have been working
> towards) be emulated somewhere.

What I have done so far is to present a static config space to the PCI layer
in user domains. This works well for reading the config space, but not for
writing to it. Writes, for example those coming through the
pci_enable_device() call, try to activate the device by writing into its
config space. Unfortunately the emulation is only good up to the point where
writes have to be passed through to the device's real config space. Once you
start passing writes through to the real device, you can just as well pass
the reads through to it - the emulation might then only be useful for
presenting an emulated bus entry to the PCI layer. Another requirement for
properly reading the config space is some locking between the write to IO
port 0xcf8 and the subsequent read from 0xcfc. I think a proper solution
would require that IO port accesses in the range 0xcf8-0xcff from all
domains (including domain 0) be intercepted and served with either emulated
parts (the bus) or the real devices - no domain would be treated differently.
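To make the locking point concrete: the interception would latch the value
each domain writes to 0xcf8 and only touch the real ports when the data port
is accessed, under one global lock - a sketch with made-up names, written as
kernel/hypervisor-style C:

    /* Per-domain copy of whatever the domain last wrote to CONFIG_ADDRESS. */
    struct domain_pci_state {
        uint32_t config_address;
    };

    /* One lock serialises the real 0xcf8 write and the following 0xcfc
     * access, so no other domain (not even domain 0) can slip in between. */
    static DEFINE_SPINLOCK(pci_conf_lock);

    /* Trapped write to the domain's 0xcf8: nothing touches hardware yet. */
    void emul_cf8_write(struct domain_pci_state *s, uint32_t val)
    {
        s->config_address = val;
    }

    /* Trapped read from the domain's 0xcfc: replay the latched address and
     * read the data in one atomic step (or answer from the emulated bus). */
    uint32_t emul_cfc_read(struct domain_pci_state *s)
    {
        uint32_t val;

        spin_lock(&pci_conf_lock);
        outl(s->config_address, 0xcf8);
        val = inl(0xcfc);
        spin_unlock(&pci_conf_lock);

        return val;
    }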

> 
> Had you considered retasking some of the existing "IO packet" stuff as used
> by the Qemu device model to pass requests up to userspace?  Since this is
> only for device discovery the performance hit shouldn't be an issue.  This
> avoids adding code to Xen *and* avoids special-casing in the PCI code.

I had not looked at this so far.
The problem I see in any case is the IRQ number translation; I believe that
can only be done properly in domain 0. Somewhere in the path of the
acpi_register_gsi() call (invoked from pci_enable_device()) you have to be
able to calculate the IRQ number that is actually used in the system.

> 
> While I'm on the subject, I'd personally like to see guests granted IO access
> slightly differently.  There are two ways to grant IO access on x86: change
> the IOPL (giving the guest access to all IO ports) or set IO bits in the TSS
> (giving fine grained control).  The problem with the latter is that guest
> *apps* will be able to access the hardware; essentially x86 gives you coarse
> grained control and ring-level protection, or vice-versa.
> 
> Since people often like to partition their systems using Xen, I don't really
> like giving apps easy access to the hardware in this way.  I'd like to have
> the option of trapping IO port writes in Xen and verifying the guest's IO
> privileges in software, then emulating the write.  It is my hope that this
> won't hurt too much on decent hardware (e.g. devices that use an in memory
> buffer descriptor queue) and that on less clever hardware it won't matter
> too much...
> 
> Thoughts?

What I thought could possibly be done is to open up domain 0's IO port
access to all ports at the beginning. A driver domain that learns through
the PCI config space that it needs access to a range of IO ports could then
have those ports opened up as well (maybe automatically by the PCI emulation
layer), and at the same time those ports could be closed in domain 0. Also,
if the permission bitmap in the TSS should not be used in hardware, one
could certainly do it in software in an 'IO port emulation layer'. Two pages
for this bitmap would be enough.
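For the software variant the bookkeeping is small - one bit per port, 65536
ports, i.e. 8192 bytes or two 4K pages per domain - roughly like this (all
names made up):

    #include <stdint.h>

    #define IO_PORTS        65536
    #define IO_BITMAP_BYTES (IO_PORTS / 8)       /* 8192 bytes = 2 pages */

    /* Software IO permission bitmap: a set bit means the domain may touch
     * that port.  Checked by the port-emulation layer on every trapped
     * in/out instruction. */
    struct io_bitmap {
        uint8_t allowed[IO_BITMAP_BYTES];
    };

    static inline int io_port_allowed(const struct io_bitmap *bm,
                                      unsigned port)
    {
        return (bm->allowed[port >> 3] >> (port & 7)) & 1;
    }

    static inline void io_port_grant_range(struct io_bitmap *bm,
                                           unsigned first, unsigned count)
    {
        unsigned p;
        for (p = first; p < first + count; p++)
            bm->allowed[p >> 3] |= (uint8_t)(1 << (p & 7));
    }

Opening a port range for a driver domain and closing it in domain 0 would
then just be a grant call on one bitmap and the corresponding clear on the
other.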

  Stefan


> 
> Cheers,
> Mark
> 
> 
> >   Stefan
> >
> > > Cheers,
> > > Mark
> > >
> > > >   Stefan
> > > >
> > > > > > > Note that giving direct physical access to a PCI device has
> > > > > > > security implications since the guest can potentially use the
> > > > > > > cards' DMA capabilities to access all of physical memory.
> > > > > >
> > > > > > Will IOMMU support help solving this security problems ?
> > > > >
> > > > > Yes but only if it enforces access permissions fully i.e. I don't
> > > > > think the IOEMU in AMD64 machines is sufficient.  From the looks of
> > > > > Pacifica it might have sufficient support to control the DMA problem,
> > > > > I'm sure Intel have a similar solution (although I don't think it's
> > > > > implemented in Vanderpool - they'll probably need chipset support).
> > > > >
> > > > > Cheers,
> > > > > Mark
> > > > >
> > > > > > Regards,
> > > > > > Sting
> > > > > >
> > > > > > On 8/28/05, Mark Williamson <mark.williamson@xxxxxxxxxxxx> wrote:
> > > > > > > > What about other devices ? let's say a PCI sound card (or any
> > > > > > > > other PCI device). Where is the software that should handle it ?
> > > > > > > > I remember I saw somewhere some discussion about PCI
> > > > > > > > configuration space, but I don't remember where.
> > > > > > >
> > > > > > > That code is in Xen itself in Xen 2.0.  Xen controls access to
> > > > > > > the PCI configuration spaces so that guests can only see the
> > > > > > > devices they have access to.  It also controls the IO memory /
> > > > > > > ports that domains are allowed to access in order to control PCI
> > > > > > > devices.
> > > > > > >
> > > > > > > Note that giving direct physical access to a PCI device has
> > > > > > > security implications since the guest can potentially use the
> > > > > > > cards' DMA capabilities to access all of physical memory.  The
> > > > > > > front/back-style devices do not have this limitation.
> > > > > > >
> > > > > > > Btw, I've laid some groundwork for a virtual sound device but
> > > > > > > haven't had much time to hack on it yet.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Mark


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel