Last night I tried a bk pull and noticed some errors, which meant nothing was applying, so I was missing a few changesets. I cloned a brand new tree and built a new set of images, and now I'm back to having it spontaneously reboot with no error messages.
d'oh.
james
> It looked like everything was okay except for these messages in DOM0:
>
> (file=main.c, line=270) Failed MMU update transferring to DOM1
> (file=main.c, line=270) Failed MMU update transferring to DOM1
>
> but then I tried to start another domain and got this:
>
> Using config file /etc/xen/mail2
> Error: Internal Server Error
>
> so it looks like something is still wrong somewhere...
Is this with or without the 'better blk dev fix' changeset backed
out?
The "Failed MMU update" messages are very interesting, and I've
never seen them before -- Xen is refusing to transfer the page to
dom1 for some reason. Please can you try doing the same with a
debug=y build of Xen. Xen should tell us a bit more about why
it's refusing the request.
What workload are you running when this happens? You seem to have
a real talent for provoking hard to reproduce bugs ;-)
It might be moderately interesting to see the traceback from
xend to see which stage of creating the new domain failed.
Further, once it gets into this state, it would be good to try
shutting down or destroying the other domains one by one, doing an
'xm list' after each stage. If one of the domains hangs around
after a 'destroy' it's a sign there's been an inconsistency.
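
The destroy-then-list check above could be scripted roughly as follows.
This is only a sketch, not a tested tool: it assumes 'xm list' prints a
header row followed by one row per domain with the name in the first
column, and that a short pause is enough for a destroyed domain to
disappear. The helper names (xm, listed_names, find_lingering) and the
settle delay are made up for illustration.

```python
import subprocess
import time


def xm(*args):
    """Run an xm subcommand and return whatever it printed on stdout."""
    result = subprocess.run(["xm", *args], capture_output=True, text=True)
    return result.stdout


def listed_names(xm_list_output):
    """Return the set of domain names found in 'xm list' output.

    Assumption: the first line is a header and every following
    non-empty line starts with the domain name.
    """
    rows = xm_list_output.strip().splitlines()[1:]
    return {row.split()[0] for row in rows if row.strip()}


def find_lingering(domains, run=xm, settle=2.0):
    """Destroy each domain in turn and report any that are still listed.

    'run' is injectable so the logic can be exercised without a Xen host.
    'settle' is a guessed delay (seconds) before re-checking the list.
    """
    lingering = []
    for name in domains:
        run("destroy", name)
        time.sleep(settle)
        if name in listed_names(run("list")):
            # Still present after a destroy: possible inconsistency.
            lingering.append(name)
    return lingering
```

For example, find_lingering(["mail1", "mail2"]) would return the names
of any domains that survived their destroy; the domain names here are
placeholders, so substitute whatever 'xm list' actually shows.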
Ian