RE: [Xen-devel] [PATCH] Re-enable MSI support

To: Espen Skoglund <espen.skoglund@xxxxxxxxxxxxx>
Subject: RE: [Xen-devel] [PATCH] Re-enable MSI support
From: "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx>
Date: Thu, 11 Dec 2008 11:18:17 +0800
Accept-language: en-US
Cc: "wei.wang2@xxxxxxx" <wei.wang2@xxxxxxx>, "Tian, Kevin" <kevin.tian@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Delivery-date: Wed, 10 Dec 2008 19:18:46 -0800
In-reply-to: <18751.62957.797341.541925@xxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <E2263E4A5B2284449EEBD0AAB751098401C31CA5E9@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <18751.43300.900610.363039@xxxxxxxxxxxxxxxxxx> <0A882F4D99BBF6449D58E61AAFD7EDD601E23C88@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <18751.56590.620965.388937@xxxxxxxxxxxxxxxxxx> <E2263E4A5B2284449EEBD0AAB751098401C31CAD86@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <18751.62957.797341.541925@xxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: Acla6St0565cRP0pSOW+njevaovIuQAUYFVA
Thread-topic: [Xen-devel] [PATCH] Re-enable MSI support
Espen Skoglund <espen.skoglund@xxxxxxxxxxxxx> wrote:
> [Yunhong Jiang]
>> Espen Skoglund <mailto:espen.skoglund@xxxxxxxxxxxxx> wrote:
>>> [Kevin Tian]
>>>>> From: Espen Skoglund
>>>>> Sent: Wednesday, December 10, 2008 7:34 PM
>>>>> 
>>>>> [Yunhong Jiang]
>>>>>> This patch tries to do some cleanup for these issues.
>>>>>> 1) The basic idea is to remove the pci_dev's lock; instead, we
>>>>>>    use the big pcidevs_lock to protect all of the pci_dev
>>>>>>    state, including pci_dev adding/removing as well as the
>>>>>>    assignment of devices. We checked the code and there seems
>>>>>>    to be no performance-critical path here. We tried a
>>>>>>    fine-grained lock and the code turned out to be very tricky.
>>>>>> 2) Split pci_enable_msi into two steps: first it constructs the
>>>>>>    msi_desc through pci_enable_msi without holding the irq_desc
>>>>>>    lock, and then it sets up the MSI through setup_msi_irq with
>>>>>>    the irq_desc lock held.
>>>>>> 3) Change the iommu->lock and hd->mapping_lock to be irqsave.
>>>>>> 4) Fix some minor issues.
>>>>> 
>>>>>> Now the lock sequence is: pcidevs_lock -> domain's event_lock ->
>>>>>> iommu's lock -> hvm_iommu's mapping_lock. The irq_desc's lock
>>>>>> will always be the last lock held, for performance
>>>>>> considerations.
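[For reference, below is a minimal compilable sketch of the two-step flow and the nesting order described above. Plain pthread mutexes and made-up helper names stand in for Xen's locks and for pci_enable_msi/setup_msi_irq, so this only illustrates the ordering, not the actual code.]

    #include <pthread.h>

    struct msi_desc { int vector; /* plus address/data/mask state */ };

    static pthread_mutex_t pcidevs_lock  = PTHREAD_MUTEX_INITIALIZER; /* stand-in */
    static pthread_mutex_t irq_desc_lock = PTHREAD_MUTEX_INITIALIZER; /* stand-in */

    /* Step 1: build the msi_desc; pcidevs_lock is held, irq_desc's lock is not. */
    static struct msi_desc *construct_msi_desc_sketch(void)
    {
        static struct msi_desc desc;
        desc.vector = 42;                   /* placeholder vector */
        return &desc;
    }

    /* Step 2: program the vector with irq_desc's lock held, taken last of all. */
    static void setup_msi_irq_sketch(struct msi_desc *desc)
    {
        pthread_mutex_lock(&irq_desc_lock);
        (void)desc;                         /* write address/data registers here */
        pthread_mutex_unlock(&irq_desc_lock);
    }

    int main(void)
    {
        pthread_mutex_lock(&pcidevs_lock);  /* outermost lock in the order */
        /* domain's event_lock, iommu lock, mapping_lock would nest here */
        struct msi_desc *desc = construct_msi_desc_sketch();
        setup_msi_irq_sketch(desc);         /* irq_desc last, held briefly */
        pthread_mutex_unlock(&pcidevs_lock);
        return 0;
    }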
>>>>> 
>>>>> So what exactly is it that pcidevs_lock is supposed to "protect"
>>>>> now?  Does it indicate that someone is holding a reference to a
>>>>> pci_dev?  Does it indicate that someone will modify some pci_dev?
>>>>> Does it indicate that someone is messing around with interrupts
>>>>> or MSI descriptors?
>>> 
>>>> I think it protects all of the above. As those operations are all
>>>> rare, such a big lock can avoid complex lock/unlock sequences
>>>> across the different paths to the different resources of an
>>>> assigned device.
>>> 
>>> Except that, since it is taken as a read_lock most of the time, it
>>> does not actually protect things the way a spinlock would.
> 
>> Ah, yes, I didn't realize this. So what do you think about changing
>> it to a spinlock, since it is not performance critical? Or do you
>> know how much benefit the read_lock brings?
> 
> It may not be performance critical, but I don't particularly like
> seeing big global locks introduced.  By the looks of it this lock
> would also protect everything related to interrupt management.

The interrupt should be protected by the irq_desc's lock.
The state used by the interrupt path, namely the mask register and the
message (data/address) registers, should always be protected by the
irq_desc's lock; the other fields are not used by the ISR, so they are
interrupt-safe and can be protected by pcidevs_lock.

Or have you noticed anything in the code that violates this rule?
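
To make the rule concrete, here is a small compilable sketch. The field and function names are illustrative, not the real Xen structures; a plain pthread mutex stands in for taking the irq_desc lock with interrupts disabled, and another for pcidevs_lock.

    #include <pthread.h>
    #include <stdint.h>

    struct msi_state_sketch {
        /* Touched from interrupt context: protect with the irq_desc lock. */
        uint32_t msg_addr, msg_data;
        int masked;
        /* Never touched by the ISR: pcidevs_lock alone is enough. */
        int owner_domid;
    };

    static pthread_mutex_t irq_desc_lock = PTHREAD_MUTEX_INITIALIZER; /* stand-in */
    static pthread_mutex_t pcidevs_lock  = PTHREAD_MUTEX_INITIALIZER; /* stand-in */

    /* Reprogramming the message registers: irq_desc's lock (irqsave in reality). */
    static void write_msi_msg_sketch(struct msi_state_sketch *s,
                                     uint32_t addr, uint32_t data)
    {
        pthread_mutex_lock(&irq_desc_lock);
        s->msg_addr = addr;
        s->msg_data = data;
        pthread_mutex_unlock(&irq_desc_lock);
    }

    /* Ownership bookkeeping never runs in the ISR, so pcidevs_lock suffices. */
    static void assign_owner_sketch(struct msi_state_sketch *s, int domid)
    {
        pthread_mutex_lock(&pcidevs_lock);
        s->owner_domid = domid;
        pthread_mutex_unlock(&pcidevs_lock);
    }

    int main(void)
    {
        struct msi_state_sketch s = { 0 };
        assign_owner_sketch(&s, 1);
        write_msi_msg_sketch(&s, 0xfee00000u, 0x4041u);
        return 0;
    }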

> 
> 
>> Also, as for reference counting on the pci_dev, do you have a plan
>> to add such support? For example, I'm not sure whether we can return
>> failure from pci_remove_device if there is still a reference to it.
>> Would dom0 handle such a failure?
> 
> The reason I opted for a pdev->spinlock rather than a refcount was that
> removing a pdev with a refcount is kind of messy.  The pdev is not an
> anonymous struct that can be released by, e.g., an RCU-like mechanism.
> It is intrinsically tied to a BDF, and we cannot create a new pdev
> for that BDF until the previous pdev struct has been completely purged.
> 
> The idea with pcidevs_lock was to only protect the pci_dev list
> itself.  The lock order of pcidevs_lock and pdev->lock only matters
> when write_lock is involved.  So, the following two combinations are fine:
> 
>    read_lock(pcidevs_lock) => spin_lock(pdev)
>    spin_lock(pdev) => read_lock(pcidevs_lock)
> 
> The following combinations are not:
> 
>    read_lock(pcidevs_lock) => spin_lock(pdev)
>    spin_lock(pdev) => write_lock(pcidevs_lock)
> 
> There is only one place in Xen where the last case occurs (the
> location you identified in reassign_device_ownership).  However, this
> code can easily be fixed so that it is deadlock-free.

As you stated in another mail, more deadlocks may exist around pcidevs_lock.
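
For reference, the incompatible pairing you list above can be shown with a small self-contained example (a pthread rwlock and mutex stand in for pcidevs_lock and pdev->lock; if the two paths run on separate threads at the same time they can deadlock, so main below only runs them serially):

    #include <pthread.h>
    #include <stdio.h>

    static pthread_rwlock_t pcidevs_lock = PTHREAD_RWLOCK_INITIALIZER;
    static pthread_mutex_t  pdev_lock    = PTHREAD_MUTEX_INITIALIZER;

    /* Path A: read_lock(pcidevs_lock) => spin_lock(pdev) -- fine on its own. */
    static void *path_a(void *arg)
    {
        pthread_rwlock_rdlock(&pcidevs_lock);
        pthread_mutex_lock(&pdev_lock);        /* may wait while B holds pdev */
        pthread_mutex_unlock(&pdev_lock);
        pthread_rwlock_unlock(&pcidevs_lock);
        return arg;
    }

    /* Path B: spin_lock(pdev) => write_lock(pcidevs_lock).
     * The writer must wait for A's read lock while A waits for pdev, so the
     * two paths can deadlock if they run concurrently. */
    static void *path_b(void *arg)
    {
        pthread_mutex_lock(&pdev_lock);
        pthread_rwlock_wrlock(&pcidevs_lock);  /* blocks behind A's read lock */
        pthread_rwlock_unlock(&pcidevs_lock);
        pthread_mutex_unlock(&pdev_lock);
        return arg;
    }

    int main(void)
    {
        path_a(NULL);
        path_b(NULL);
        puts("ran each path serially; concurrently they can deadlock");
        return 0;
    }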

> 
> The idea behind pdev->lock was to protect against concurrent updates
> and usages relating to a particular pdev --- including assignment to
> domains, IOMMU assignment, and MSI assignment.  My guess is that where
> things broke down was where the MSI-related pdev->lock locking
> interfered with the other interrupt-related locking.  I don't have a
> clear enough picture of the interrupt-related locks to pinpoint the
> problem more exactly.

I suspect it is more about the MSI-related pdev->lock. Let me list how the
information is protected currently:
1) the device list, i.e. alldevs_list, which is currently protected by
pcidevs_lock;
2) device assignment, which is protected by pci_dev->lock, if I understand
correctly;
3) IOMMU assignment, i.e. domain->arch.pdev_list, which is currently
protected by pcidevs_lock;
4) MSI assignment, i.e. pci_dev->msi_list, which is protected by
pci_dev->lock.

I think the issue is with items 3 and 4.
For item 3, because the IOMMU assignment is protected by pcidevs_lock, it
causes the reassign_device_ownership deadlock. If you do want a fine-grained
lock, maybe you can consider a per-domain lock like domain->arch.pdev_lock :-)
For item 4, I think the issue comes from multiple-vector MSI support.
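
As a rough illustration of that per-domain idea (all names below are hypothetical, e.g. domain->arch.pdev_lock does not exist today, and the list surgery is elided):

    #include <pthread.h>

    struct pci_dev { struct pci_dev *next; int bdf; };

    struct domain_sketch {
        pthread_mutex_t pdev_lock;     /* hypothetical domain->arch.pdev_lock */
        struct pci_dev *pdev_list;     /* domain->arch.pdev_list              */
        int domid;
    };

    /* Reassignment takes each domain's own lock instead of nesting inside the
     * global pcidevs_lock, so it cannot self-deadlock on that global lock.
     * The two domain locks are taken in domid order to avoid ABBA. */
    static void reassign_sketch(struct domain_sketch *from,
                                struct domain_sketch *to,
                                struct pci_dev *pdev)
    {
        struct domain_sketch *first  = (from->domid < to->domid) ? from : to;
        struct domain_sketch *second = (first == from) ? to : from;

        pthread_mutex_lock(&first->pdev_lock);
        if (second != first)
            pthread_mutex_lock(&second->pdev_lock);

        /* unlinking pdev from 'from' is elided; link it into 'to' */
        pdev->next = to->pdev_list;
        to->pdev_list = pdev;

        if (second != first)
            pthread_mutex_unlock(&second->pdev_lock);
        pthread_mutex_unlock(&first->pdev_lock);
    }

    int main(void)
    {
        struct domain_sketch d0 = { .domid = 0 }, d1 = { .domid = 1 };
        struct pci_dev dev = { .bdf = 0x0300 };     /* arbitrary BDF */
        pthread_mutex_init(&d0.pdev_lock, NULL);
        pthread_mutex_init(&d1.pdev_lock, NULL);
        reassign_sketch(&d0, &d1, &dev);
        return 0;
    }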

Thanks
Yunhong Jiang

> 
> 
>       eSk
> 
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel