Hi,
Is your 32bit pae domU paravirt or hvm? We have seen similar ext3
corruptions on rhel3 and rhel4 32bit pae hvm guests, one of which appeared
to be triggered by a shadow optimization for pae.
thanks
kurt
On Mon, Jun 23, 2008 at 12:15:33PM -0400, Christopher S. Aker wrote:
> We've been seeing a rash of ext3 directory corruption occurring under Xen.
> All but one of the reports have been with filesystems formatted with 1024
> blocksize. We have one report, that's potentially the same bug, occurring
> on a filesystem with 4096 blocksize (either way, it was some type of
> corruption in that case). In all cases, the filesystems were mounted with
> ext3's default journaling mode. No quotas or anything else other than the
> default ext3 mount options.
>
> It's happened on a number of different hosts, all of the same hardware and
> software configuration (Xen 3.2 64bit, 32bit pae dom0, 32bit pae domUs,
> LVM backend with 3ware hardware RAID-1). Some of those hosts were
> previously running non-virtualized Linux and UML, using the identical guest
> images, and under that configuration never experienced this problem.
>
> This has occurred under both 2.6.18-xenbits and the more recent pv_ops
> based kernels (2.6.24, 2.6.25), which I presume are all using the same
> blkfront driver code.
>
> The common workloads from the reports seem to be active maildirs and
> rsync.
>
> The initial errors reported back are all from fs/ext3/dir.c, in
> ext3_check_dir_entry(). Most commonly hit is the "rec_len % 4 != 0" check.
> We've seen other checks trigger, but my assumption is that those happen
> after more stuff gets whacked out.
>
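
[Illustrative aside: the per-entry sanity checks referred to above can be
sketched offline against a raw directory block. This is a minimal Python
sketch, not the kernel code. It assumes the standard little-endian ext3
on-disk entry layout (inode, rec_len, name_len, file_type, name); the
function names and the inodes_count parameter are hypothetical, and the
message strings only approximate those in fs/ext3/dir.c.]

```python
import struct

# On-disk ext3 directory entry header: inode (le32), rec_len (le16),
# name_len (u8), file_type (u8). The name bytes follow the header.
DIRENT_HEADER = struct.Struct("<IHBB")


def min_rec_len(name_len):
    """Smallest legal rec_len for a name: 8-byte header + name, rounded up to 4."""
    return (name_len + 8 + 3) & ~3


def check_dir_block(block, inodes_count):
    """Walk one directory block; return [(offset, reason)] for the first bad entry.

    Scanning stops at the first failure because a bad rec_len means the
    position of the next entry can no longer be trusted.
    """
    problems = []
    offset = 0
    while offset + DIRENT_HEADER.size <= len(block):
        inode, rec_len, name_len, _ftype = DIRENT_HEADER.unpack_from(block, offset)
        if rec_len < min_rec_len(1):
            reason = "rec_len is smaller than minimal"
        elif rec_len % 4 != 0:
            reason = "rec_len % 4 != 0"
        elif rec_len < min_rec_len(name_len):
            reason = "rec_len is too small for name_len"
        elif offset + rec_len > len(block):
            reason = "directory entry across blocks"
        elif inode > inodes_count:
            reason = "inode out of bounds"
        else:
            offset += rec_len
            continue
        problems.append((offset, reason))
        break
    return problems


def make_entry(inode, name, rec_len):
    """Build one raw entry for experiments (hypothetical helper, file_type=2 is a directory)."""
    raw = DIRENT_HEADER.pack(inode, rec_len, len(name), 2) + name
    return raw.ljust(rec_len, b"\0")
```

[A block read from an image, for example with dd using the filesystem's
block size, can be passed straight to check_dir_block(). An empty list
means every entry in that block passed these particular checks, nothing
more.]
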
> Eventually the fs will go read-only. In extreme cases, the fs is chewed
> through enough that data is lost.
>
> It's tricky to track down the trigger because you can only detect the
> corruption after it's happened. Our attempts to reproduce this using
> various filesystem thrashing scripts haven't yielded a reliable way to
> trigger it; however, we have managed to trigger it twice -- in
> two weeks :( .
>
> My hope is that this triggers an "a-hah" from someone in LKML or Xen land
> who has experience with this code, or that this is a known issue and a fix
> already lives.
>
> We're scared. Please help.
>
> Thanks,
> -Chris
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel