Thanks for your comments. However, I got the impression that what I
saw here in this email thread was not in accordance with what I got
when this issue was first submitted.
I am not aware of any context of larger scope of MSI-X cleaning ups if
you are planning to do so. As a result, I might be missing some
important points. So please just go ahead and submit your patches.
Comments are embedded below.
2011/7/12 Jan Beulich <JBeulich@xxxxxxxxxx>:
>>>> On 12.07.11 at 07:24, Haitao Shan <maillists.shan@xxxxxxxxx> wrote:
>> Hi,
>>
>> As reported by Jan, current Qemu does not handle MSIX table mapping
>> properly.
>>
>> Details:
>>
>> MSI-X table resides in one of the physical BARs. When Qemu handles
>> guest's changes to BAR register (within which, MSI-X table resides),
>> Qemu first allows access of the whole BAR MMIO ranges and then removes
>> those of MSI-X. There is a small window here. It is possible that on a
>> SMP guests one vcpu could have access to the physical MSI-X
>> configurations when another vcpu is writing BAR registers.
>>
>> The patch fixes this issue by first producing the valid MMIO ranges by
>> removing MSI-X table's range from the whole BAR mmio range and later
>> passing these ranges to Xen.
>
> That's only half of it - something similar would need to be done for the
> pending bit array.
Please justify why this read-only PBA should also be removed from
guest access. Note that we actually mask physical MSI via MSI-X table
when guests mask it via virtualized MSI-X table.
>
> Further I'm having the impression that while you avoid assigning the
> questionable MMIO range to the guest (which isn't a security concern
> as long as the BAR determination for the device in the hypervisor is
> correct), your patch doesn't prevent qemu actually mapping these
> ranges writably and allow pci_msix_writel() to access it (which is the
> actual open security problem).
I totally disagree. I think Dom0 together with its management SW
entities such as Qemu and libxc/libxl are to be trusted. It can be
arguable to what extent Xen can trust them.
Mapping that to Qemu is mainly for writing MASK bit to physical MSI-X
table directly. Handling guests' masking MSI is already too long a
code path.
If Qemu is not trusted, I would say perhaps you can move MSI
virtualization part from Qemu to Xen itself.
>
> Further, I don't think it's correct to remove guest access to either of
> the two ranges altogether - either qemu needs to emulate access to
> these, or the guest ought to be able to access the ranges directly,
> but read-only.
PBA is exposed to guests, unless it happens to be located on the same
page of MSI-X table (in this case, it have to be removed), per my
understanding.
MSI-X table cannot be exposed to guests even read-only.
Shan Haitao
>
> Jan
>
>> Please have a review, thanks!
>>
>> Signed-off-by: Shan Haitao <haitao.shan@xxxxxxxxx>
>>
>> diff --git a/hw/pass-through.c b/hw/pass-through.c
>> index 9c5620d..b9c2f32 100644
>> --- a/hw/pass-through.c
>> +++ b/hw/pass-through.c
>> @@ -92,6 +92,7 @@
>>
>> #include <unistd.h>
>> #include <sys/ioctl.h>
>> +#include <assert.h>
>>
>> extern int gfx_passthru;
>> int igd_passthru = 0;
>> @@ -1103,6 +1104,7 @@ static void pt_iomem_map(PCIDevice *d, int i,
>> uint32_t e_phys, uint32_t e_size,
>> {
>> struct pt_dev *assigned_device = (struct pt_dev *)d;
>> uint32_t old_ebase = assigned_device->bases[i].e_physbase;
>> + uint32_t msix_last_pfn = 0, bar_last_pfn = 0;
>> int first_map = ( assigned_device->bases[i].e_size == 0 );
>> int ret = 0;
>>
>> @@ -1118,39 +1120,124 @@ static void pt_iomem_map(PCIDevice *d, int i,
>> uint32_t e_phys, uint32_t e_size,
>>
>> if ( !first_map && old_ebase != -1 )
>> {
>> - add_msix_mapping(assigned_device, i);
>> - /* Remove old mapping */
>> - ret = xc_domain_memory_mapping(xc_handle, domid,
>> + if ( has_msix_mapping(assigned_device, i) )
>> + {
>> + msix_last_pfn = (assigned_device->msix->mmio_base_addr - 1 +
>> + assigned_device->msix->total_entries * 16) >>
>> XC_PAGE_SHIFT;
>> + bar_last_pfn = (old_ebase + e_size - 1) >> XC_PAGE_SHIFT;
>> +
>> + if ( assigned_device->msix->table_off )
>> + {
>> + ret = xc_domain_memory_mapping(xc_handle, domid,
>> + old_ebase >> XC_PAGE_SHIFT,
>> + assigned_device->bases[i].access.maddr >> XC_PAGE_SHIFT,
>> + (assigned_device->msix->mmio_base_addr >> XC_PAGE_SHIFT)
>> + - (old_ebase >> XC_PAGE_SHIFT),
>> + DPCI_REMOVE_MAPPING);
>> + if ( ret != 0 )
>> + {
>> + PT_LOG("Error: remove old mapping failed!\n");
>> + return;
>> + }
>> + }
>> + if ( msix_last_pfn != bar_last_pfn )
>> + {
>> + assert(msix_last_pfn < bar_last_pfn);
>> + ret = xc_domain_memory_mapping(xc_handle, domid,
>> + msix_last_pfn + 1,
>> + (assigned_device->bases[i].access.maddr +
>> + assigned_device->msix->table_off +
>> + assigned_device->msix->total_entries * 16 +
>> + XC_PAGE_SIZE -1) >> XC_PAGE_SHIFT,
>> + bar_last_pfn - msix_last_pfn,
>> + DPCI_REMOVE_MAPPING);
>> + if ( ret != 0 )
>> + {
>> + PT_LOG("Error: remove old mapping failed!\n");
>> + return;
>> + }
>> + }
>> + }
>> + else
>> + {
>> + /* Remove old mapping */
>> + ret = xc_domain_memory_mapping(xc_handle, domid,
>> old_ebase >> XC_PAGE_SHIFT,
>> assigned_device->bases[i].access.maddr >> XC_PAGE_SHIFT,
>> (e_size+XC_PAGE_SIZE-1) >> XC_PAGE_SHIFT,
>> DPCI_REMOVE_MAPPING);
>> - if ( ret != 0 )
>> - {
>> - PT_LOG("Error: remove old mapping failed!\n");
>> - return;
>> + if ( ret != 0 )
>> + {
>> + PT_LOG("Error: remove old mapping failed!\n");
>> + return;
>> + }
>> }
>> }
>>
>> /* map only valid guest address */
>> if (e_phys != -1)
>> {
>> - /* Create new mapping */
>> - ret = xc_domain_memory_mapping(xc_handle, domid,
>> + if ( has_msix_mapping(assigned_device, i) )
>> + {
>> + assigned_device->msix->mmio_base_addr =
>> + assigned_device->bases[i].e_physbase
>> + + assigned_device->msix->table_off;
>> +
>> + msix_last_pfn = (assigned_device->msix->mmio_base_addr - 1 +
>> + assigned_device->msix->total_entries * 16) >>
>> XC_PAGE_SHIFT;
>> + bar_last_pfn = (e_phys + e_size - 1) >> XC_PAGE_SHIFT;
>> +
>> +
>> cpu_register_physical_memory(assigned_device->msix->mmio_base_addr,
>> + assigned_device->msix->total_entries * 16,
>> + assigned_device->msix->mmio_index);
>> +
>> + if ( assigned_device->msix->table_off )
>> + {
>> + ret = xc_domain_memory_mapping(xc_handle, domid,
>> + assigned_device->bases[i].e_physbase >> XC_PAGE_SHIFT,
>> + assigned_device->bases[i].access.maddr >> XC_PAGE_SHIFT,
>> + (assigned_device->msix->mmio_base_addr >> XC_PAGE_SHIFT)
>> + - (assigned_device->bases[i].e_physbase >>
>> XC_PAGE_SHIFT),
>> + DPCI_ADD_MAPPING);
>> + if ( ret != 0 )
>> + {
>> + PT_LOG("Error: remove old mapping failed!\n");
>> + return;
>> + }
>> + }
>> + if ( msix_last_pfn != bar_last_pfn )
>> + {
>> + assert(msix_last_pfn < bar_last_pfn);
>> + ret = xc_domain_memory_mapping(xc_handle, domid,
>> + msix_last_pfn + 1,
>> + (assigned_device->bases[i].access.maddr +
>> + assigned_device->msix->table_off +
>> + assigned_device->msix->total_entries * 16 +
>> + XC_PAGE_SIZE -1) >> XC_PAGE_SHIFT,
>> + bar_last_pfn - msix_last_pfn,
>> + DPCI_ADD_MAPPING);
>> + if ( ret != 0 )
>> + {
>> + PT_LOG("Error: remove old mapping failed!\n");
>> + return;
>> + }
>> + }
>> + }
>> + else
>> + {
>> + /* Create new mapping */
>> + ret = xc_domain_memory_mapping(xc_handle, domid,
>> assigned_device->bases[i].e_physbase >> XC_PAGE_SHIFT,
>> assigned_device->bases[i].access.maddr >> XC_PAGE_SHIFT,
>> (e_size+XC_PAGE_SIZE-1) >> XC_PAGE_SHIFT,
>> DPCI_ADD_MAPPING);
>>
>> - if ( ret != 0 )
>> - {
>> - PT_LOG("Error: create new mapping failed!\n");
>> + if ( ret != 0 )
>> + {
>> + PT_LOG("Error: create new mapping failed!\n");
>> + }
>> }
>>
>> - ret = remove_msix_mapping(assigned_device, i);
>> - if ( ret != 0 )
>> - PT_LOG("Error: remove MSI-X mmio mapping failed!\n");
>> -
>> if ( old_ebase != e_phys && old_ebase != -1 )
>> pt_msix_update_remap(assigned_device, i);
>> }
>> diff --git a/hw/pt-msi.c b/hw/pt-msi.c
>> index 71fa6f0..1fbebd4 100644
>> --- a/hw/pt-msi.c
>> +++ b/hw/pt-msi.c
>> @@ -528,39 +528,12 @@ static CPUReadMemoryFunc *pci_msix_read[] = {
>> pci_msix_readl
>> };
>>
>> -int add_msix_mapping(struct pt_dev *dev, int bar_index)
>> +int has_msix_mapping(struct pt_dev *dev, int bar_index)
>> {
>> if ( !(dev->msix && dev->msix->bar_index == bar_index) )
>> return 0;
>>
>> - return xc_domain_memory_mapping(xc_handle, domid,
>> - dev->msix->mmio_base_addr >> XC_PAGE_SHIFT,
>> - (dev->bases[bar_index].access.maddr
>> - + dev->msix->table_off) >> XC_PAGE_SHIFT,
>> - (dev->msix->total_entries * 16
>> - + XC_PAGE_SIZE -1) >> XC_PAGE_SHIFT,
>> - DPCI_ADD_MAPPING);
>> -}
>> -
>> -int remove_msix_mapping(struct pt_dev *dev, int bar_index)
>> -{
>> - if ( !(dev->msix && dev->msix->bar_index == bar_index) )
>> - return 0;
>> -
>> - dev->msix->mmio_base_addr = dev->bases[bar_index].e_physbase
>> - + dev->msix->table_off;
>> -
>> - cpu_register_physical_memory(dev->msix->mmio_base_addr,
>> - dev->msix->total_entries * 16,
>> - dev->msix->mmio_index);
>> -
>> - return xc_domain_memory_mapping(xc_handle, domid,
>> - dev->msix->mmio_base_addr >> XC_PAGE_SHIFT,
>> - (dev->bases[bar_index].access.maddr
>> - + dev->msix->table_off) >> XC_PAGE_SHIFT,
>> - (dev->msix->total_entries * 16
>> - + XC_PAGE_SIZE -1) >> XC_PAGE_SHIFT,
>> - DPCI_REMOVE_MAPPING);
>> + return 1;
>> }
>>
>> int pt_msix_init(struct pt_dev *dev, int pos)
>> diff --git a/hw/pt-msi.h b/hw/pt-msi.h
>> index 9664f89..2dc1720 100644
>> --- a/hw/pt-msi.h
>> +++ b/hw/pt-msi.h
>> @@ -107,10 +107,7 @@ void
>> pt_msix_disable(struct pt_dev *dev);
>>
>> int
>> -remove_msix_mapping(struct pt_dev *dev, int bar_index);
>> -
>> -int
>> -add_msix_mapping(struct pt_dev *dev, int bar_index);
>> +has_msix_mapping(struct pt_dev *dev, int bar_index);
>>
>> int
>> pt_msix_init(struct pt_dev *dev, int pos);
>
>
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|