On Fri, Apr 30, 2010 at 09:32:37AM -0700, dwight at supercomputer.org wrote:
> Is anyone else running the latest XCP on HP ProLiant DL380
> systems? Or a similar dual Xeon 8-core system? I'm seeing
> spontaneous reboots when under a load.
>
> Specifically, when 4 Windows HVMs are loaded, I haven't noticed
> any reboots yet. But when running 7 or 8, the system will
> reboot within minutes. Very little information appears on
> the console.
>
> I built a debugging version of the hypervisor, which changed
> the behavior; the system managed to stay up for 2-3 hours
> with 7 VMs running. However, it again spontaneously rebooted,
> with no real messages on the console as to why.
>
> I can send out the console log messages this evening, along
> with the system information if there's interest. Alas, I
> don't have access to these items at the moment.
>
> I have also been running memtest86 overnight. As of 1.5 hours into
> the test, there were no errors. But there are 48 GB of RAM
> on the system, so the testing wasn't complete when I left.
>
> Any suggestions here? I was going to build a 32-bit kernel
> from the latest patches, but it appears CentOS 5.4 Xen is
> also not stable on these systems. I had trouble getting
> the kernel to build here, with various errors. The most
> notable of which was:
>
> ----------------------
> CC arch/x86/kernel/acpi/processor.o
> In file included from arch/x86/kernel/acpi/processor.c:8:
> include/linux/kernel.h:185: internal compiler error: Segmentation fault
> Please submit a full bug report,
> with preprocessed source if appropriate.
> See <http://bugzilla.redhat.com/bugzilla> for instructions.
> The bug is not reproducible, so it is likely a hardware or OS problem.
> make[2]: *** [arch/x86/kernel/acpi/processor.o] Error 1
> make[1]: *** [arch/x86/kernel/acpi] Error 2
> make: *** [arch/x86/kernel] Error 2
> ----------------------
>
Uhm.. the compiler really shouldn't crash.
Are you sure your hardware is OK? If the stock EL5.4 Xen also crashes,
the hardware itself could be broken.
Did you try running memtest86+?
Is bare-metal Linux stable if you run, for example, a
"make -j8 bzImage && make -j8 modules && make clean" kernel build in a loop?
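
A rough sketch of that loop, in case it helps. The function names
stress_loop and kernel_build_pass are made up for illustration, and it
assumes you start it from the top of an already configured kernel tree.
It stops at the first failing pass, so an intermittent compiler crash
shows up with an iteration number:

```shell
#!/bin/sh
# Run a command repeatedly; stop at the first failure and report it.
# Usage: stress_loop ITERATIONS COMMAND [ARGS...]
stress_loop() {
    iterations=$1
    shift
    i=1
    while [ "$i" -le "$iterations" ]; do
        if ! "$@"; then
            echo "iteration $i FAILED: $*"
            return 1
        fi
        echo "iteration $i ok"
        i=$((i + 1))
    done
    echo "all $iterations iterations passed"
}

# One pass of the kernel build quoted above (hypothetical helper;
# assumes the current directory is a configured kernel source tree).
kernel_build_pass() {
    make -j8 bzImage && make -j8 modules && make clean
}
```

Something like "stress_loop 50 kernel_build_pass" would then run 50
passes. If it fails at a different file each time on bare metal, that
points at hardware rather than at Xen.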
-- Pasi
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel