This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
Home Products Support Community News


Re: [Xen-devel] [PATCH] ioemu: passthough: add no_wb option for pci conf

To: Qing He <qing.he@xxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH] ioemu: passthough: add no_wb option for pci conf write
From: Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>
Date: Mon, 9 Nov 2009 17:19:38 +0000
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Mon, 09 Nov 2009 09:20:02 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20091109030345.GA16917@ub-qhe2>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <20091106091756.GA29376@ub-qhe2> <19188.26651.778508.710707@xxxxxxxxxxxxxxxxxxxxxxxx> <20091109030345.GA16917@ub-qhe2>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Qing He writes ("Re: [Xen-devel] [PATCH] ioemu: passthough: add no_wb option 
for pci conf write"):
> The physical content of MSI data/address is then decided and written by Xen.
> xc_physdev_map_pirq_msi is also used to update guest MSI, including vector
> and affinity.
> Now come to the pt_pci_write_config logic:
>       pci_read_block(&read_val);
>       reg->u.dw.write(read_val, &val);   // the handler
>       pci_write_block(val);
> Since MSI data/address is fully emulated, val always equals to read_val,
> i.e. write what is read back to the register. This would be OK for most of
> the time, however, when the guest changes MSI affinity, something happens
> between read and write. the handler calls xc_physdev_map_pirq_msi to update
> the MSI, hypervisor changes the affinity and write a new vector/affinity
> to the real registers. When the handler returns, pci_write_block(val)
> overwrites the real registers, all the HV changes are lost, making the
> MSI fail.

So as I understand it the problem is as follows, if I may put what you
have said in different words:

 1. Write accesses by the guest to PCI config space are done
    by reading the corresponding real PCI config register to
    get the old value, and passing the old and new values to
    the write handler.

 2. Depending on the write handler, some action will be taken.  For
    an MSI affinity change, this is pt_msgaddr32_reg_write which calls

 3. The write access unconditionally writes the value from the write
    handler into the real device.

The problem is that pt_msi_update will itself actually change the real
affinity in the real PCI config space to a different value.  And then
in step 3 we overwrite that correct value with a wrong one.

It is not convenient for pt_msgaddr32_reg_write to return the correct
value for writing into the real PCI config space (a) because it's
computed by Xen and we don't have the value accessible (b) there might
be races and complicationswith reading it again (c) there might be
races in writing it twice.

The problem occurs because of the previous assumption that guest PCI
config space writes are all write-through, possibly with some
modification to the written value.  The new flag prevents the
write-through (not a write _back_).

In which case I think it's fine if somewhat misnamed.  But it would be
better to consider whether the assumpion is actually valid; perhaps it
would be better for the write handlers to explicitly do the write to
real config space themselves if they need it ?


Xen-devel mailing list