Clearly there's some fairly random memory corruption going on, which
then causes segfaults (if the corruption hits code pages) and
filesystem corruption (if the corruption hits buffer-cache pages).
The "Bailing: not a -ve offset" and "GPF (0004):" messages are almost
certainly just symptoms of executing a corrupted block of code, i.e.
the bug was triggered some time earlier and has probably corrupted a
page of glibc or the kernel.
It would be interesting to see whether or not this is SMP-related.
It's also interesting that someone said they couldn't reproduce
corruption when using 2.6.7 for the non-privileged guest OSes.
-- Keir
> that sounds like the same sort of errors i'm getting which appeared to be
> filesystem corruption. First the corruption starts, then everything you do
> causes a segfault, although i've only seen funny things happen in dom0.
>
> In the limited testing i've done it looks like dom0 by itself is stable, but
> crashes start occurring once I start up other domains and work dom0 hard
> (other domains running under light load). I'm running this script in dom0:
>
> #!/bin/sh
> while [ 1 = 1 ]
> do
> diff file3 file4 && echo okay
> done
>
> where file3 and file4 are around 300mb files, and the vm has 128mb of memory
> with no swap. This ensures that none of the file is cached so there's lots of
> I/O.
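
A minimal sketch of a variant of the loop above, in case it helps pin down
when the corruption first shows up. This is not James's script: the function
name compare_loop and its arguments are hypothetical, and it assumes a POSIX
sh with cmp available. It stops at the first failed comparison and reports
how many clean passes came before it.

```sh
#!/bin/sh
# Hypothetical helper (not from the original post): compare two files
# repeatedly and stop at the first pass where they differ.
# Usage: compare_loop FILE_A FILE_B MAX_PASSES
compare_loop() {
    a=$1
    b=$2
    max=$3
    pass=0
    while [ "$pass" -lt "$max" ]; do
        # cmp -s is silent; a non-zero status means the files differ
        # or one of them could not be read.
        if ! cmp -s "$a" "$b"; then
            echo "mismatch after $pass clean passes"
            return 1
        fi
        pass=$((pass + 1))
    done
    echo "okay: $pass passes"
    return 0
}
```

For example, "compare_loop file3 file4 1000" would run up to 1000 passes over
the two files. As in the original test, the files need to be larger than the
domain's memory so that each pass really goes to disk.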
>
> It has crashed most readily when i'm running a few other domains and then
> start running dom0 out of memory, but nothing conclusive yet.
>
> I'll let this test keep running for another hour (otherwise idle, no other
> domains running) or so then start my running-out-of-memory program.
>
> I wonder if it is coincidence that we both have smp boxes... each of the
> domains only sees 1 cpu so I wouldn't have thought that would be a problem
> unless there's a race in xen itself.
>
> James
>
>
> From: Derek Glidden
> Sent: Mon 19/07/2004 3:22 PM
> To: xen-devel@xxxxxxxxxxxxxxxxxxxxx
> Subject: [Xen-devel] segfault in VM
>
>
> Maybe related or maybe not, but it was the same VM getting all the
> scheduling time in my previous post. (SMP Celeron box with 512M of
> RAM, no himem enabled.)
>
> At the time, four VMs were all compiling, with dom0 copying a linux
> source tree from one place to another with rsync. Everything copacetic
> until I started the big rsync in dom0, where within a minute or so, vm2
> bombed. No messages on the dom0 console or in the VM other than the
> "Segmentation Fault" in the VM during compilation.
>
> However XEN (compiled with debug=y) console spits out:
>
> (XEN) (file=x86_32/emulate.c, line=228) Bailing: not a -ve offset into
> 4GB segment.
>
> at the time of the segmentation fault.
>
> (and there are lots of these, pretty much any time there is heavy i/o
> on the machine, all with the same values:)
>
> (XEN) (file=traps.c, line=466) GPF (0004): fc5277a8 -> fc52a294
>
> Any further activity inside vm2 results in more segmentation faults and
> more "Bailing" messages. The other VMs and dom0 seem to be ok.
>
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> "We all enter this world in the | Support Electronic Freedom
> same way: naked; screaming; soaked | http://www.eff.org/
> in blood. But if you live your | http://www.anti-dmca.org/
> life right, that kind of thing |---------------------------
> doesn't have to stop there." -- Dana Gould
>
>
>
> -------------------------------------------------------
> This SF.Net email is sponsored by BEA Weblogic Workshop
> FREE Java Enterprise J2EE developer tools!
> Get your free copy of BEA WebLogic Workshop 8.1 today.
> http://ads.osdn.com/?ad_id=4721&alloc_id=10040&op=click
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxxxx
> https://lists.sourceforge.net/lists/listinfo/xen-devel