In our Xen cluster, we have:
- Many DomU hosts (CentOS 5.2, paravirtualized) mounting a GFS filesystem on a VBD,
- A few Dom0 hosts (CentOS 5.2), connected over GigE,
- A single SAN providing shared block storage for all of the above.
This works great most of the time. The DomU storage is backed by logical volumes on the Dom0s, all part of a clustered VG on the SAN.
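As an illustration of this layout, a DomU config file (xm config files use Python syntax) might contain a disk stanza like the sketch below. The volume group, logical volume, and device names are placeholders rather than the real ones, and using the `phy:` (blkback) backend with the shared-write mode `w!` is an assumption about how such a volume would typically be attached.

```python
# Hypothetical fragment of a DomU config file; every name here is a placeholder.
disk = [
    # Root disk private to this guest.
    "phy:/dev/vg_san/lv_roll03_root,xvda,w",
    # Shared GFS volume. The "w!" mode asks xend to permit a writable
    # attachment even though other guests attach the same device.
    "phy:/dev/vg_san/lv_wwwdocs,xvdb,w!",
]
```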
Once every few weeks, however, we experience FS corruption with kernel messages like:
May 28 09:30:26 r3core-roll03 kernel: GFS: fsid=r3core-inner:wwwdocs.1: fatal: invalid metadata block
May 28 09:30:26 r3core-roll03 kernel: GFS: fsid=r3core-inner:wwwdocs.1: bh = 27845 (type: exp=4, found=9)
May 28 09:30:26 r3core-roll03 kernel: GFS: fsid=r3core-inner:wwwdocs.1: function = gfs_get_meta_buffer
May 28 09:30:26 r3core-roll03 kernel: GFS: fsid=r3core-inner:wwwdocs.1: file = /builddir/build/BUILD/gfs-kmod-0.1.23/_kmod_build_xen/src/gfs/dio.c, line = 1225
May 28 09:30:26 r3core-roll03 kernel: GFS: fsid=r3core-inner:wwwdocs.1: time = 1243517426
May 28 09:30:27 r3core-roll03 kernel: GFS: fsid=r3core-inner:wwwdocs.1: about to withdraw from the cluster
May 28 09:30:27 r3core-roll03 kernel: GFS: fsid=r3core-inner:wwwdocs.1: telling LM to withdraw
The remedy is to shut down the nodes accessing the shared FS, fsck and/or mkfs it, then start up again.
What is puzzling is the exact cause of the FS corruption. As we try to narrow it down, I've been forced to closely examine the block layers in Xen. While I don't fully understand (yet) what blkback is doing, I'm nervous that its request queueing causes blocks to be flushed to disk asynchronously. That could be very bad for shared filesystems, as I'd expect a file's metadata blocks to need to be on physical media by the time a lock is released.
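To make the ordering concern concrete, here is a minimal user-space sketch of "flush before unlock". It is only an analogy: it uses an ordinary POSIX file lock and `os.fsync`, not GFS's own locking, and the function names `update_under_lock` and `demo` are hypothetical. The guarantee also only holds if every layer underneath honors the flush, which is the open question for the Xen block path.

```python
import fcntl
import os
import tempfile


def update_under_lock(path, payload):
    """Write payload to path and flush it before releasing the lock."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)   # take an exclusive lock
        view = memoryview(payload)
        while view:                      # os.write may write only part
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)                     # ask the kernel to flush to the device
        fcntl.lockf(fd, fcntl.LOCK_UN)   # only now let another writer in
    finally:
        os.close(fd)


def demo():
    """Write a small record to a temporary file and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "metadata.bin")
        update_under_lock(path, b"inode 27845")
        with open(path, "rb") as fh:
            return fh.read()
```

If a lower layer acknowledges the write but defers or reorders it, another node could take the lock and read stale blocks even though the writer did everything in the right order.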
So I'm looking at blktap now. Most documentation suggests configuring VBDs with tap:aio:, but my reading suggests it can also reorder or defer block writes, which I'm trying to avoid. It looks like tap:sync: is what I really need, though very little documentation is available on that specific driver.
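For reference, the two blktap variants would differ only in the driver prefix of the disk line. The sketch below uses Python syntax, as DomU config files do. The device path and the `xvdb` name are placeholders, and it is an assumption that the `w!` shared-write mode behaves the same for tap devices as for `phy:` ones.

```python
# Hypothetical alternatives for the shared VBD line; all names are placeholders.
SHARED_LV = "/dev/vg_san/lv_wwwdocs"

# blktap with the asynchronous I/O driver, as most documentation suggests.
disk_aio = ["tap:aio:%s,xvdb,w!" % SHARED_LV]

# blktap with the synchronous driver, the variant under consideration.
disk_sync = ["tap:sync:%s,xvdb,w!" % SHARED_LV]

# A real config file has a single `disk` assignment; pick one variant.
disk = disk_sync
```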
Surely somebody must have had this problem before, but a couple of days of searching and reading have yielded very little. Or am I way off base in understanding the magic that is GFS and how it guarantees filesystem consistency?
Help please?
-Jeff
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users