
[Xen-devel] PCI BAR register space written with garbage in HVM guest.


  • To: xen-devel@xxxxxxxxxxxxxxxxxxx
  • From: Dan Gora <dan.gora@xxxxxxxxx>
  • Date: Mon, 15 Mar 2010 22:09:28 -0300
  • Delivery-date: Mon, 15 Mar 2010 18:10:25 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Hi All,

I'm having a problem where, if I pass through two instances of my
device to an HVM domU, one of the board instances has its PCI
BAR registers overwritten with garbage by some unknown actor 30
seconds to a minute after I load my driver.  I cannot for the life of
me find what might be overwriting the BAR registers.

I've added a debugging printf to XEN in
xen/arch/x86/pci.c:pci_conf_write() and I can see the entire PCI BAR
address space being overwritten with garbage:

(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080000 offset=0x0 bytes=4 value=0xffffffff
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080004 offset=0x0 bytes=4 value=0x1600ffff
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080008 offset=0x0 bytes=4 value=0x64d5323e
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008000c offset=0x0 bytes=4 value=0x450008
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080010 offset=0x0 bytes=4 value=0xa7e54002
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080014 offset=0x0 bytes=4 value=0x11400000
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080018 offset=0x0 bytes=4 value=0x693
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008001c offset=0x0 bytes=4 value=0xffff0000
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080020 offset=0x0 bytes=4 value=0x4400ffff
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080024 offset=0x0 bytes=4 value=0x2c024300
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080028 offset=0x0 bytes=4 value=0x1012dac
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008002c offset=0x0 bytes=4 value=0xa1c30006
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080030 offset=0x0 bytes=4 value=0xa00040d
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080034 offset=0x0 bytes=4 value=0x0
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080038 offset=0x0 bytes=4 value=0x0
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008003c offset=0x0 bytes=4 value=0x0
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080040 offset=0x0 bytes=4 value=0x0
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080044 offset=0x0 bytes=4 value=0x16000000
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080048 offset=0x0 bytes=4 value=0x64d5323e
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x8008004c offset=0x0 bytes=4 value=0x0
(XEN) xen/arch/x86/pci.c: pci_conf_write: cf8=0x80080050 offset=0x0 bytes=4 value=0x0
<snipped, rest of PCI BAR registers written with 0x0...>
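For anyone reading the trace: the cf8 values follow the standard PCI
configuration mechanism #1 address layout (enable bit 31, bus in bits
23:16, device in bits 15:11, function in bits 10:8, dword-aligned
register in bits 7:2).  A minimal sketch of a decoder is below.  The
struct and function names are made up for illustration; they are not
taken from XEN or the Linux kernel.

```c
#include <stdint.h>

/* Fields of a CONFIG_ADDRESS (port 0xcf8) value, mechanism #1.
 * Hypothetical helper type, not part of XEN or Linux. */
struct cf8_fields {
    unsigned enable;  /* bit 31: 1 = configuration cycle enabled */
    unsigned bus;     /* bits 23:16 */
    unsigned dev;     /* bits 15:11 */
    unsigned fn;      /* bits 10:8  */
    unsigned reg;     /* bits 7:2, returned as a byte offset (dword aligned) */
};

/* Split a raw cf8 value into its bus/device/function/register parts. */
struct cf8_fields cf8_decode(uint32_t cf8)
{
    struct cf8_fields f;

    f.enable = (cf8 >> 31) & 0x1;
    f.bus    = (cf8 >> 16) & 0xff;
    f.dev    = (cf8 >> 11) & 0x1f;
    f.fn     = (cf8 >> 8)  & 0x7;
    f.reg    = cf8 & 0xfc;
    return f;
}
```

Decoded this way, cf8=0x80080010 is bus 0x08, device 0, function 0,
register 0x10.  Note that the trace starts at register 0x00, so the
writes cover the start of the configuration header and not only the
BARs at 0x10-0x24.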

I've added printks to the dom0 and domU kernels in the
pci_bus_write_config_##size() macros in drivers/pci/access.c and in
arch/x86/pci/direct.c to print every time the kernel accesses PCI
configuration space.  I see these printks when my driver or some
other driver accesses PCI configuration space, but I do NOT see them
when this PCI BAR register space trashing happens.

I also noticed that lspci does not cause these kernel printks to
occur, and upon reading the pciutils source code I learned that pretty
much anything which can do an outl() to 0xcf8/0xcfc can mess with PCI
configuration space.
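To illustrate why such an access never reaches the kernel's PCI
accessors, here is a rough sketch of a user-space dword read through
0xcf8/0xcfc.  It assumes x86 Linux with glibc's <sys/io.h> and a
process privileged enough for iopl() to succeed; cf8_address() and
user_pci_read32() are names I made up, not pciutils functions.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <sys/io.h>   /* iopl(), outl(), inl(): glibc, x86 Linux only */
#endif

#define PCI_CONFIG_ADDRESS 0xcf8
#define PCI_CONFIG_DATA    0xcfc

/* Build the CONFIG_ADDRESS value for configuration mechanism #1.
 * Hypothetical helper, shown only to make the bit layout explicit. */
uint32_t cf8_address(unsigned bus, unsigned dev, unsigned fn, unsigned reg)
{
    assert(bus < 256 && dev < 32 && fn < 8 && reg < 256);
    return 0x80000000u | ((uint32_t)bus << 16) | ((uint32_t)dev << 11)
         | ((uint32_t)fn << 8) | (reg & 0xfcu);
}

#if defined(__i386__) || defined(__x86_64__)
/* Read one configuration dword straight from user space.
 * Returns 0 on success, -1 if iopl() is refused (not privileged).
 * Nothing here goes through drivers/pci/access.c, so printks placed
 * there will not fire. */
int user_pci_read32(unsigned bus, unsigned dev, unsigned fn,
                    unsigned reg, uint32_t *out)
{
    if (iopl(3) != 0)
        return -1;
    outl(cf8_address(bus, dev, fn, reg), PCI_CONFIG_ADDRESS);
    *out = inl(PCI_CONFIG_DATA);
    return 0;
}
#endif
```

A write would be the same sequence with an outl() to 0xcfc instead of
the inl(), which is all it takes for a privileged process to produce
a trace like the one above.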

So now I figure it must be some user space thing, unless I'm just
missing some other way by which the kernel or XEN can access PCI
configuration space, but what could it possibly be?

This problem only occurs in HVM guests, only seems to occur when I
pass two instances of my device to the domU, and only occurs 30-60
seconds after I load my driver.  I'm absolutely sure that it's not my
driver, because the kernel printks show up whenever my driver accesses
PCI configuration space and none show up when the trashing happens.

I'm really pretty much at a loss as to how to even debug this.  There
doesn't appear to be any dump_stack() in XEN that would let me see what
called pci_conf_write(), and even then it appears that it only gets
called as the result of a trap from the dom0 or domU.  It's not clear
to me if you can even see what process/stack actually caused the trap
back in the dom0 or domU.  Is that possible?

Is there anything else that I should look at?  qemu?  pciback?
pcifront?  Am I missing some access method to PCI configuration space
down in the kernel, or is pci_conf1_read/write pretty much it?  Any
ideas what could possibly be trying to overwrite all of PCI
configuration space like this?

_any_ ideas are most welcome..

thanks
dan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

