This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] Re: APIC rework

To: "Zhang, Xiantao" <xiantao.zhang@xxxxxxxxx>
Subject: Re: [Xen-devel] Re: APIC rework
From: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Date: Wed, 25 Nov 2009 13:00:31 -0500
Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, "Han, Weidong" <weidong.han@xxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>, "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx>
Delivery-date: Wed, 25 Nov 2009 10:06:43 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <706158FABBBA044BAD4FE898A02E4BC201CD419316@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <706158FABBBA044BAD4FE898A02E4BC201CD3207E0@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <C72970BC.C323%keir.fraser@xxxxxxxxxxxxx> <706158FABBBA044BAD4FE898A02E4BC201CD3A074E@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <20091124194401.GA29566@xxxxxxxxxxxxxxxxxxx> <706158FABBBA044BAD4FE898A02E4BC201CD3A08EB@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <20091125134144.GA2586@xxxxxxxxxxxxxxxxxxx> <706158FABBBA044BAD4FE898A02E4BC201CD419316@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.19 (2009-01-05)
On Wed, Nov 25, 2009 at 11:21:45PM +0800, Zhang, Xiantao wrote:
> Konrad Rzeszutek Wilk wrote:
> > On Wed, Nov 25, 2009 at 10:43:47AM +0800, Zhang, Xiantao wrote:
> >> Konrad Rzeszutek Wilk wrote:
> >>>> At least dom0 parses this info from DSDT, so we can't have the
> >>>> assuption whether it is used or not, I think. And I also agree to
> >>>> add a new physdev_op to handle this case, and it should be better
> >>>> way to go. Based on this idea, I worked out the patch, attached! 
> >>>> In this patch, we introduced a new physdev_op PHYSDEVOP_setup_gsi
> >>>> for each GSI setup, and each domain can require to map each GSI in
> >>>> this case. In addition, I believe it is very safe to port the
> >>>> hypervisor patch to xen-3.4-x tree and keeps pv_ops dom0 running
> >>>> on it, since no logic is changed.  BTW, I also tested apic and
> >>>> non-apic cases, they works fine after applying the patches.
> >>> 
> >>> But I don't think you tested PCI front and PCI back.
> >>> 
> >>> Mainly these lines worry me (can you inline the patch next time
> >>> too, please): 
> >>> 
> >>> +               map_irq.domid = DOMID_SELF;
> >>> +               map_irq.type = MAP_PIRQ_TYPE_GSI;
> >>> +               map_irq.index = gsi;
> >>> +               map_irq.pirq = irq;
> >>> +               rc = HYPERVISOR_physdev_op(PHYSDEVOP_map_pirq, &map_irq);
> >>> 
> >>> For PCI passthrough to work, the domid needs to be for the guest
> >>> domain, while in this case it is set to Dom0.
> >>> There is already a method of extracting the domain id for PCI
> >>> devices passed to the guest. Look in the 'xen_create_msi_irq'
> >>> function. 
> >> 
> >> Could you detail the concern ?  This hypercall is only related to
> >> GSI, not MSI, why it has side-effect about pci passthrough ? 
> > 
> > This is for PV guests _without_ using QEMU. They are using the PCI
> > backend to "enable" a device (drivers/xen/pciback and
> > drivers/pci/xen-pcifront.c). 
> > The front and back-end communicate the IRQ number (GSI) to the guest
> > when enabling a INTx PCI device (not MSI ones).
> > 
> > Then the PV guest can bind the IRQ (GSI) number to its own event
> > channel and 
> > have fully working PCI device.
> > 
> > With your change, the privileged domain pins the device to itself,
> > not to 
> > other domains.
> But I think dom0 should own the device first during boot, and then assign it 
> to the PV guest when the device is required by pcifront?  Basically, we don't 
> know which devices should be reserved for non-privileged domains, right?  So 
> I think the GSI should be initialized and bound to dom0 when dom0 boots?  Once 
> the device is assigned to a PV guest, it may be necessary to unmap the GSI 
> from dom0 and map it for the domU. 

During boot the device can be owned by pciback or by the module for which a
PCI entry exists. Look for the pciback.hide entry.

There are two modes of execution:
 1). The first is what you described, wherein the device initially belongs to Dom0.
     The user unbinds it from its driver and binds it to the pciback module.
     At that point, the device is disabled and ready for PV guests. When a PV 
     guest starts, the pciback module makes the pci_enable_device call and sets
     the IRQ for the device (for MSI, it obviously gets the IDT value from the 
 2). Dom0 boots where the user specified on the command line 
     The pciback module owns the device (is bound to it) and the native module
     that would load for this PCI device is not called.

It is correct that the unmapping/mapping and the ownership need to be dynamic,
as a user could bind the device to the pciback module, give it to a guest, kill
the guest, then map the PCI device back to dom0, and after that repeat the
whole thing.
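
To make the ownership cycle above concrete, here is a minimal sketch of it as
a small state machine. Every name in it (pt_dev, dev_owner, the transition
helpers) is hypothetical and made up for illustration; none of it is pciback
code. It only shows that the domain an interrupt should be mapped for changes
as the device moves between Dom0's driver, pciback, and a guest.

```c
#include <stdint.h>

typedef uint16_t domid_t;   /* 16-bit domain id, as in Xen's public headers */

/* Hypothetical ownership states for a passthrough-capable PCI device. */
enum dev_owner { OWNER_DOM0_DRIVER, OWNER_PCIBACK, OWNER_GUEST };

struct pt_dev {
    enum dev_owner owner;
    domid_t guest;          /* only meaningful while owner == OWNER_GUEST */
};

/* Each helper returns 0 on success and -1 on an illegal transition. */

/* Dom0's native driver lets go; pciback takes the (now disabled) device. */
static int bind_to_pciback(struct pt_dev *d)
{
    if (d->owner != OWNER_DOM0_DRIVER)
        return -1;
    d->owner = OWNER_PCIBACK;
    return 0;
}

/* A PV guest starts; this is where the IRQ must be mapped for the guest. */
static int assign_to_guest(struct pt_dev *d, domid_t guest)
{
    if (d->owner != OWNER_PCIBACK)
        return -1;
    d->owner = OWNER_GUEST;
    d->guest = guest;
    return 0;
}

/* The guest is killed; the device falls back to pciback. */
static int guest_gone(struct pt_dev *d)
{
    if (d->owner != OWNER_GUEST)
        return -1;
    d->owner = OWNER_PCIBACK;
    return 0;
}

/* The user hands the device back to Dom0's native driver. */
static int return_to_dom0(struct pt_dev *d)
{
    if (d->owner != OWNER_PCIBACK)
        return -1;
    d->owner = OWNER_DOM0_DRIVER;
    return 0;
}

/* Non-zero when the interrupt mapping should target the guest, not Dom0. */
static int irq_targets_guest(const struct pt_dev *d)
{
    return d->owner == OWNER_GUEST;
}
```

The whole sequence can be repeated any number of times, which is why a mapping
fixed to Dom0 at boot does not cover the passthrough case.
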
> BTW, I just met a strange issue with the  function xen_create_msi_irq you 
> mentioned, and it blocks initialization of SR-IOV devices' VFs , and I think 
> it should be a bug.

Hmm.. I am not sure if this is the appropriate place for it. You see,
this driver is designed so that machines that don't have SR-IOV, VFs,
or VT-d are able to pass through a PCI device to the PV guest. You can
use this to run PV guests on a Pentium 4 style machine.

I think that the SR-IOV devices would go through a different call stack
to enable the device? Either way, I recently got my hands on an SR-IOV machine
and will see how this works.

Has this been working before with SR-IOV cards, or is this
your first experiment with them?

> In the function xen_create_msi_irq, there is one line as following to get the 
> domid of the specified device, but a strange domid(0xfff0) is returned by 
> this call, could you help to check whether this strange domid is from?   
> domid = rc = xenbus_walk( "/local/domain/0", get_domid_for_dev, dev);

That is due to casting (domid is a short int, rc is an int).
You need to check whether rc is negative and, if so, set the domid to DOMID_SELF.
The next line below is meant to do this:

        if (domid <= 0)
                domid = DOMID_SELF;

Oh wait, it looks like the fix for this never got pushed upstream. That
should have been 'if (rc <= 0)'. Yikes.

rc is -16 (EBUSY). That implies that xenstored has not
been initialized on Dom0.
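
For illustration, here is a small stand-alone sketch of why the check on domid
misses the error while the check on rc catches it. It assumes domid is an
unsigned 16-bit type (as Xen's domid_t is) and that DOMID_SELF is 0x7FF0; the
two helper functions are hypothetical and only mimic the two variants of the
check, they are not the actual xen_create_msi_irq code.

```c
#include <stdint.h>

typedef uint16_t domid_t;                 /* assumed: unsigned 16-bit id */
#define DOMID_SELF ((domid_t)0x7FF0U)     /* assumed value of DOMID_SELF */

/*
 * Variant with the check on domid: rc is narrowed first, so a negative
 * errno such as -16 becomes 0xfff0 and is never seen as "<= 0".
 */
static domid_t pick_domid_buggy(int rc)
{
    domid_t domid = (domid_t)rc;          /* -16 turns into 0xfff0 here */

    if (domid <= 0)                       /* only true when domid == 0 */
        domid = DOMID_SELF;
    return domid;
}

/*
 * Variant with the check on rc: the sign is tested before narrowing, so
 * any error (or 0) falls back to DOMID_SELF.
 */
static domid_t pick_domid_fixed(int rc)
{
    if (rc <= 0)
        return DOMID_SELF;
    return (domid_t)rc;
}
```

With rc = -16, the first variant hands back 0xfff0 (the strange domid seen
above) and the second hands back DOMID_SELF; for a valid positive id both
return the id unchanged.
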

> ^^^^^^^
> Xiantao 
