Scott Garron wrote:
dom0 console and HVM domUs would periodically hang for several
seconds and then return as if nothing was wrong. [.snip.] I ended
up fixing it by unsetting CONFIG_NO_HZ in the kernel.
Jeremy Fitzhardinge wrote:
What kernel is this? This sounds like a symptom of the sched_clock
problem I fixed a few weeks ago.
2.6.32.18
ref: refs/heads/xen/stable-2.6.32.x
git log shows this as the most recent commit (from Aug 30):
commit 2968b258b1ca6bd16d758dd68900669419caff2b
It could just be slightly different architecture or the fact that
the machine has overall less RAM (4G instead of 8G).
What happens if you boot that system with "mem=4G"?
I finally managed to try this last night, and it didn't seem
to make any difference. It did seem to last a bit longer (I had it
creating and removing snapshots every 6 seconds while the backup process
was also creating and removing them as needed, and it went along for
about 20 minutes before becoming unstable). The OOPS message was
different than last time, but similar to the first one I sent when
reporting this.
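
For anyone wanting to reproduce the load, the stress test described above (creating and removing a snapshot every 6 seconds) could be scripted roughly as follows. This is only a sketch, not the script actually used: the volume group `vg0`, origin volume `root`, snapshot name `stress-snap` and the 1G snapshot size are placeholder assumptions. By default it just prints the commands; running them for real needs root and an existing LVM volume.

```python
import subprocess
import time


def snapshot_cycle_commands(vg, origin, snap, size="1G"):
    """Return the lvcreate/lvremove command pair for one snapshot cycle."""
    create = ["lvcreate", "--snapshot", "--size", size,
              "--name", snap, f"/dev/{vg}/{origin}"]
    remove = ["lvremove", "--force", f"/dev/{vg}/{snap}"]
    return create, remove


def stress(vg, origin, snap="stress-snap", interval=6.0, cycles=3,
           dry_run=True, sleep=time.sleep):
    """Create and remove a snapshot `cycles` times, pausing `interval` seconds.

    Returns the list of commands issued, in order.
    """
    issued = []
    for _ in range(cycles):
        for cmd in snapshot_cycle_commands(vg, origin, snap):
            issued.append(cmd)
            if dry_run:
                # Only show what would be run.
                print(" ".join(cmd))
            else:
                subprocess.run(cmd, check=True)
        sleep(interval)
    return issued


if __name__ == "__main__":
    # Hypothetical names: replace vg0/root with a real VG and LV.
    stress("vg0", "root", cycles=2, interval=0)
```

Passing `dry_run=False` runs the commands instead of printing them; `check=True` makes the loop stop at the first lvcreate or lvremove that fails, which is usually the point of interest in a test like this.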
After it crashed, I also went ahead and flashed the BIOS to the latest
version, to see if it made any difference. After flashing, I booted
normally (without mem=4G), and got it to crash again - this time with a
similar OOPS message to the one I sent to you in my previous e-mail.
The new BIOS didn't help, obviously. I've appended the ps -eH
-owchan,nwchan,cmd outputs and kernel OOPS messages from last night to
the end of the text file at:
http://www.pridelands.org/~simba/hurricane-server.txt
udevd: worker did not accept message -1 (Connection refused) kill
Are they atypical?
I don't recall seeing them before, but after flashing the BIOS, they are
no longer occurring.
This post seems to be eerily similar to the problem I'm
experiencing. http:[...]xen-devel/2010-09/msg00169.html
Aside from udev being involved, the symptom looks quite different.
I suppose that's true, but he mentions in this post:
http://lists.xensource.com/archives/html/xen-devel/2010-09/msg00286.html
that lvcreate and udev are hanging while creating a snapshot volume.
That's the reason I thought it was similar. (That, and he seems to do
backups in much the same way I do: create a snapshot, make a copy of
the snapshot [although he block-attaches the volume to a domU to do it,
whereas I just use dom0], then remove the snapshot.)
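
The backup cycle mentioned above (snapshot, copy, remove) can be sketched like this. It is an illustration under assumptions, not either of our actual backup scripts: the use of `dd` for the copy, the snapshot name `backup-snap`, the 1G size and the `run` parameter (there only so the steps can be exercised without touching LVM) are all hypothetical choices.

```python
import subprocess


def backup_via_snapshot(vg, origin, dest, snap="backup-snap", size="1G",
                        run=subprocess.run):
    """Snapshot an LV, copy the snapshot to `dest`, then drop the snapshot."""
    snap_dev = f"/dev/{vg}/{snap}"
    run(["lvcreate", "--snapshot", "--size", size,
         "--name", snap, f"/dev/{vg}/{origin}"], check=True)
    try:
        # Copy the frozen snapshot, not the live origin volume.
        run(["dd", f"if={snap_dev}", f"of={dest}", "bs=4M"], check=True)
    finally:
        # Remove the snapshot even if the copy failed.
        run(["lvremove", "--force", snap_dev], check=True)
```

The `try`/`finally` is there so a failed copy does not leave a stale snapshot behind to fill up and slow down writes to the origin volume.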
--
Scott Garron