Espen Skoglund wrote:
> [Yunhong Jiang]
>> Espen Skoglund <mailto:espen.skoglund@xxxxxxxxxxxxx> wrote:
>>> [Kevin Tian]
>>>>> From: Espen Skoglund
>>>>> Sent: Wednesday, December 10, 2008 7:34 PM
>>>>>
>>>>> [Yunhong Jiang]
>>>>>> This patch tries to clean up these issues.
>>>>>> 1) The basic idea is to remove the pci_dev's lock and instead
>>>>>> use the big pcidevs_lock to protect all of the pci_dev
>>>>>> handling, including both pci_dev addition/removal and the
>>>>>> assignment of devices. We checked the code and there seems to
>>>>>> be no performance-critical path here, while a fine-grained
>>>>>> lock would make the code very tricky.
>>>>>> 2) Split pci_enable_msi into two steps: first construct the
>>>>>> msi_desc in pci_enable_msi without holding the irq_desc
>>>>>> lock, then set up the MSI in setup_msi_irq with the irq_desc
>>>>>> lock held.
>>>>>> 3) Change the iommu->lock and hd->mapping_lock to be irq_save.
>>>>>> 4) Fix some minor issues.
>>>>>
>>>>>> Now the lock sequence is: pcidevs_lock -> domain's event_lock ->
>>>>>> iommu's lock -> hvm_iommu's mapping_lock. The irq_desc's lock
>>>>>> will always be the last lock held, for performance reasons.
>>>>>
>>>>> So what exactly is it that pcidevs_lock is supposed to "protect"
>>>>> now? Does it indicate that someone is holding a reference to a
>>>>> pci_dev? Does it indicate that someone will modify some pci_dev?
>>>>> Does it indicate that someone is messing around with interrupts
>>>>> or MSI descriptors?
>>>
>>>> I think it protects all of the above. Since those operations are
>>>> all rare, such a big lock avoids complex lock/unlock sequences
>>>> across the different paths to the different resources of an
>>>> assigned device.
>>>
>>> Except that, since it is taken as a read_lock most of the time, it
>>> does not actually protect things the way a spinlock would.
>
>> Ahh, yes, I didn't realize that. What do you think about changing
>> it to a spinlock, since it is not performance critical? Or do you
>> know how much benefit the read_lock gives?
>
> It may not be performance critical, but I don't particularly like
> seeing big global locks introduced. By the looks of it this lock
> would also protect everything related to interrupt management.
Interrupt handling should be protected by the irq_desc's lock.
The state used at interrupt time, i.e. the mask register and the
message (data/address) registers, should always be protected by the
irq_desc lock; everything else is not touched by the ISR, so it is
interrupt safe and can be protected by pcidevs_lock.
Or did you notice anything in the code that violates this rule?
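
To make the split concrete, here is a rough, compilable sketch of the
rule (pthread mutexes standing in for Xen's locks; all helper names
are illustrative stubs, not the real MSI code):

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t pcidevs_lock  = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t irq_desc_lock = PTHREAD_MUTEX_INITIALIZER;

/* Illustrative stubs for the two halves of MSI setup. */
static void build_msi_desc(void) { printf("construct msi_desc\n"); }
static void attach_to_pdev(void) { printf("add to pdev->msi_list\n"); }
static void write_msi_msg(void)  { printf("program address/data\n"); }
static void unmask_msi(void)     { printf("clear mask bit\n"); }

static void example_enable_msi(void)
{
    pthread_mutex_lock(&pcidevs_lock);      /* coarse outer lock      */
    build_msi_desc();                       /* step 1: no irq_desc    */
    attach_to_pdev();                       /* lock needed here       */

    pthread_mutex_lock(&irq_desc_lock);     /* step 2: innermost lock */
    write_msi_msg();                        /* only ISR-visible state */
    unmask_msi();                           /* is touched under it    */
    pthread_mutex_unlock(&irq_desc_lock);

    pthread_mutex_unlock(&pcidevs_lock);
}

int main(void)
{
    example_enable_msi();
    return 0;
}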
>
>
>> Also, regarding reference counting of the pci_dev, do you have a
>> plan to add such support? For example, I'm not sure whether we can
>> return failure from pci_remove_device while there are still
>> references to it. Would dom0 handle such a failure?
>
> The reason I opted for a pdev->spinlock rather than a refcount was
> that removing a pdev with a refcount is kind of messy. The pdev is
> not an anonymous struct that can be released by, e.g., an RCU-like
> mechanism. It is intrinsically tied to a BDF, and we cannot create a
> new pdev for that BDF until the previous pdev struct has been
> completely purged.
>
> The idea with pcidevs_lock was to only protect the pci_dev list
> itself. The lock order of pcidevs_lock and pdev->lock only matters
> when write_lock is involved. So, the following two combinations are
> fine:
>
> read_lock(pcidevs_lock) => spin_lock(pdev)
> spin_lock(pdev) => read_lock(pcidevs_lock)
>
> The following combinations are not:
>
> write_lock(pcidevs_lock) => spin_lock(pdev)
> spin_lock(pdev) => write_lock(pcidevs_lock)
>
> There is only one place in Xen where the last case occurs (the
> location you identified in reassign_device_ownership). However, this
> code can easily be fixed so that it is deadlock free.
As you stated in another mail, more deadlocks may exist around
pcidevs_lock.
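
To make the problematic ordering above concrete, here is a rough
pthread-based sketch (path_a/path_b and the lock names are only
stand-ins, not the real Xen code paths). If the two paths run
concurrently, path_a can hold the read lock while waiting for
pdev_lock, and path_b can hold pdev_lock while waiting for the write
lock, which is a circular wait; main() runs them sequentially so the
sketch itself does not hang:

#include <pthread.h>
#include <stdio.h>

static pthread_rwlock_t pcidevs_lock = PTHREAD_RWLOCK_INITIALIZER;
static pthread_mutex_t  pdev_lock    = PTHREAD_MUTEX_INITIALIZER;

static void path_a(void)              /* e.g. a device lookup path      */
{
    pthread_rwlock_rdlock(&pcidevs_lock);
    pthread_mutex_lock(&pdev_lock);
    printf("path_a: read_lock(pcidevs) -> spin_lock(pdev)\n");
    pthread_mutex_unlock(&pdev_lock);
    pthread_rwlock_unlock(&pcidevs_lock);
}

static void path_b(void)              /* reassign_device_ownership-like */
{
    pthread_mutex_lock(&pdev_lock);
    pthread_rwlock_wrlock(&pcidevs_lock);   /* reversed order + writer */
    printf("path_b: spin_lock(pdev) -> write_lock(pcidevs)\n");
    pthread_rwlock_unlock(&pcidevs_lock);
    pthread_mutex_unlock(&pdev_lock);
}

int main(void)
{
    path_a();
    path_b();
    return 0;
}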
>
> The idea behind pdev->lock was to protect against concurrent updates
> and usages relating to a particular pdev --- including assignment to
> domains, IOMMU assignment, and MSI assignment. My guess is that where
> things broke down was where the MSI-related pdev->locking interfered
> with the other interrupt-related locking. I don't have a clear enough
> picture of the interrupt-related locks to pinpoint the problem more
> exactly.
I suspect it is more about the MSI-related pdev->locking. Let me list
how the information is protected currently:
1) the device list, i.e. alldevs_list, which is currently protected by
pcidevs_lock;
2) device assignment, i.e. which domain a pci_dev belongs to, which is
protected by pci_dev->lock, if I understand correctly;
3) IOMMU assignment, i.e. domain->arch.pdev_list, which is currently
protected by pcidevs_lock;
4) MSI assignment, i.e. pci_dev->msi_list, which is protected by
pci_dev->lock.
I think the issues are with items 3 and 4.
For item 3, because the IOMMU assignment is protected by pcidevs_lock,
it can cause a deadlock in reassign_device_ownership. If you do want a
fine-grained lock, maybe you can consider a per-domain lock like
domain->arch.pdev_lock :-) (a rough sketch of that idea is below).
For item 4, I think the issue comes from multi-vector MSI support.
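
Just to be clear, here is a purely hypothetical sketch of that
per-domain lock idea (the struct layout, the singly linked list and
the lock-by-address ordering are all simplifications for illustration,
not existing Xen code):

#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

struct pci_dev {
    struct pci_dev *next;        /* link on the owning domain's list */
    int bus, devfn;
};

struct domain {
    pthread_mutex_t pdev_lock;   /* would be domain->arch.pdev_lock  */
    struct pci_dev *pdev_list;   /* would be domain->arch.pdev_list  */
};

/*
 * Move pdev from source to target.  Only the two domains' list locks
 * are taken, in a fixed order (by address) to avoid an ABBA deadlock;
 * no write_lock on a global pcidevs_lock is needed for the list move.
 */
static void reassign_device(struct domain *source, struct domain *target,
                            struct pci_dev *pdev)
{
    int src_first = (uintptr_t)source < (uintptr_t)target;
    struct domain *first  = src_first ? source : target;
    struct domain *second = src_first ? target : source;
    struct pci_dev **pp;

    pthread_mutex_lock(&first->pdev_lock);
    pthread_mutex_lock(&second->pdev_lock);

    for ( pp = &source->pdev_list; *pp; pp = &(*pp)->next )
        if ( *pp == pdev )
        {
            *pp = pdev->next;              /* unlink from source */
            break;
        }

    pdev->next = target->pdev_list;        /* link into target   */
    target->pdev_list = pdev;

    pthread_mutex_unlock(&second->pdev_lock);
    pthread_mutex_unlock(&first->pdev_lock);
}

int main(void)
{
    struct domain d0 = { PTHREAD_MUTEX_INITIALIZER, NULL };
    struct domain d1 = { PTHREAD_MUTEX_INITIALIZER, NULL };
    struct pci_dev dev = { NULL, 0, 0 };

    d0.pdev_list = &dev;
    reassign_device(&d0, &d1, &dev);
    return 0;
}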
Thanks
Yunhong Jiang
>
>
> eSk
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel