xen-users
Re: [Xen-users] Strange issue: DomU not saving when using direct HW acce
Hi Mark,
thanks a lot for your answer!
Since it is something of a production box (all email and web services
run on it), I'll need to restart it during one of the coming nights.
I'll definitely keep you updated.
Thx again.
Regards
Falko
Mark Williamson wrote:
I have a really strange problem here:
My environment: Xen 3.0.2 on SuSE 10.1
In dom0 I disable eth0 with the following lines in /etc/init.d/boot.local:
/sbin/modprobe pciback
/bin/echo -n 0000:01:00.0 > /sys/bus/pci/drivers/e1000/unbind
/bin/echo -n 0000:01:00.0 > /sys/bus/pci/drivers/pciback/new_slot
/bin/echo -n 0000:01:00.0 > /sys/bus/pci/drivers/pciback/bind
Then I start my domU with the following parameters:
[...]
pci = [ '01:00.0' ]
dhcp = 'dhcp'
[...]
Basically everything's fine so far. domU boots and accesses the net
via dhcp over its HW-assigned eth0.
Cool.
But when I reboot dom0 it tries to save domU (which seems to be OK).
After rebooting, dom0 tries to restore domU, which fails and results in a
"cold boot" of domU (incl. file check etc. on its boot).
Now if I try to save and restore the domU manually, it fails and I get
this message:
Error: pci: Invalid config setting bus: none
Even stranger:
If I then try to start domU manually with xm create domU -c, dhcp is
just not working!
domU finds the assigned HW (eth0) but is not able to set up the network
at all! And I can't get domU back to work until I reboot the whole
system (dom0) completely!
Suspend / resume isn't supported for domains that have direct access to PCI
devices - I'm surprised the tools even allow it (they probably shouldn't!).
It's strange that subsequently starting the domain manually also fails - are
you sure that the domain you attempted to restore wasn't still hanging around
somewhere? If it really is failing when there are no other domains fighting
for that card, it could be that the state of the ethernet card (or, I guess,
maybe that of the Xen PCI pciback driver) has been messed up by the failed
operations and that's why you need a whole machine reboot.
The simple fix is to disable the automatic suspend/resume of that domain on
reboot; have it shut down and rebooted by dom0 instead. Other domains that
don't have direct hardware access may still be safely suspend-resumed.
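A minimal sketch of that workaround, assuming it is hooked into dom0's shutdown sequence before the automatic save of running domains kicks in. The helper name and the PASSTHROUGH_DOMAINS variable are made up for illustration; "domU" is the domain name used in the xm create example above, and "xm shutdown -w" waits until the domain has actually gone down.

```shell
#!/bin/sh
# Hypothetical helper: cleanly shut down domains that own a PCI device,
# so they are never saved/restored across a dom0 reboot.
# XM and PASSTHROUGH_DOMAINS are illustrative, overridable settings.
XM="${XM:-xm}"
PASSTHROUGH_DOMAINS="${PASSTHROUGH_DOMAINS:-domU}"

shutdown_passthrough_domains() {
    for dom in $PASSTHROUGH_DOMAINS; do
        # -w: block until the domain has finished shutting down
        "$XM" shutdown -w "$dom" \
            || echo "warning: could not shut down $dom" >&2
    done
}
```

Calling shutdown_passthrough_domains early in the reboot path should leave only the ordinary domains for the automatic suspend/resume. How the hook is wired in depends on the distribution's init scripts, which is not covered here.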
Something I'd be interested in: once you've got to the wedged
state of requiring a dom0 reboot, could you bring up that ethernet
device in dom0 (by rebinding it back to the e1000 driver)? This would tell
us if the device is wedged, vs pciback. Please note that trying this (or
starting new driver domains once you've got into the wedged state or doing a
resume of a saved driver domain either explicitly or at dom0 reboot) is quite
possibly going to send weird commands to your NIC; I'd not expect this to
actually harm modern hardware but it's not impossible you could get some
instability / corruption on the host system (not just the domU).
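A rough sketch of that rebinding experiment, mirroring the boot.local lines from the original report in reverse. It assumes pciback exposes a remove_slot file next to new_slot and that the generic sysfs unbind/bind files are present; the function name and the overridable variables are illustrative only. The same caveats about a possibly wedged NIC apply.

```shell
#!/bin/sh
# Sketch: hand the NIC back from pciback to its native dom0 driver.
# SYSFS_PCI, BDF and NATIVE_DRIVER are illustrative, overridable settings;
# the defaults are the device and driver named in the original report.
SYSFS_PCI="${SYSFS_PCI:-/sys/bus/pci/drivers}"
BDF="${BDF:-0000:01:00.0}"
NATIVE_DRIVER="${NATIVE_DRIVER:-e1000}"

rebind_to_native() {
    # Detach the device from pciback and drop its reserved slot
    # (remove_slot is assumed to exist alongside new_slot).
    printf '%s' "$BDF" > "$SYSFS_PCI/pciback/unbind" || return 1
    printf '%s' "$BDF" > "$SYSFS_PCI/pciback/remove_slot" || return 1
    # Ask the native driver to claim the device again.
    printf '%s' "$BDF" > "$SYSFS_PCI/$NATIVE_DRIVER/bind" || return 1
}
```

If rebind_to_native succeeds and eth0 then comes up normally in dom0, that would point at pciback rather than the card itself.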
So, if it's *not* an important / production box containing any useful data,
I'd be interested if you could experiment a bit more - otherwise just disable
the automatic suspend/resume on dom0 reboot for that domain and your problem
will be solved.
Does that answer your question? It's great to have users / testers of the
driver domains functionality, so please let us know how you get on!
Cheers,
Mark
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users