On Mar 3, 2007, at 4:57 AM, Tim Post wrote:
Now that I've mentioned it, I've had heavy 'disk' dom0 problems using
AoE as well.
I've been opting for 2x 10G optical ethernet devices in most "serious"
Xen farms so I can be really flexible. One becomes a routed IP network,
the other one is just used for AoE. One good 12 - 24 port switch is all
I need. It's not cheap, but no longer so expensive that it's totally
prohibitive.
We've considered this as well, but for now 4x GigE is working well, 2
for IP, 3 for AoE. Disk, or vis-a-vis network, congestion isn't the
problem; machines occasionally hanging is the problem.
We believe, largely from the test described previously, that we're
seeing Dom0 SLAB corruption, and we've tried just about everything to
identify it, but alas, cannot track this down.
We've compiled kernels, both Dom0 and DomU, with full debugging, etc.,
but no luck.
Using that + 15K SAS drives has really solved 99.9% of my problems with
really demanding guests. But this isn't exactly 'off the shelf' either.
I think people just somehow lost sight of how Linux uses memory, and, to
avoid needing to learn, dom-0 was just named 'taboo'.
Just treat it like any other vital system that has very little RAM.
Vital systems having very little RAM is not a familiar concept to many
newcomers to Linux since the days of 386 (and gasp) 286 / 8088 / 8086
users.
Agreed. We have plenty of RAM to cat out /proc/slabinfo via cron,
particularly so if it helps us figure out what is going on. :-)
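
For anyone wanting to try the same thing, here is a rough sketch of what
a cron-driven slabinfo logger might look like. It is not what we run;
the function names, the /var/log/slabinfo directory and the "largest
growth" comparison are all assumptions made up for illustration, and it
assumes the slabinfo 2.x column layout (name, active_objs, num_objs,
objsize, ...). It only shows which caches are growing between two
snapshots; it will not detect corruption by itself.

```python
import time
from pathlib import Path


def parse_slabinfo(text):
    """Return {cache_name: (active_objs, num_objs, objsize)}.

    Assumes the slabinfo 2.x layout, where the first four columns are
    name, active_objs, num_objs and objsize.
    """
    caches = {}
    for line in text.splitlines():
        # Skip blank lines, the version banner and the '# name ...' header.
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        active_objs, num_objs, objsize = (int(f) for f in fields[1:4])
        caches[fields[0]] = (active_objs, num_objs, objsize)
    return caches


def snapshot(src="/proc/slabinfo", dest_dir="/var/log/slabinfo"):
    """Copy the current slabinfo into a timestamped file; return its path.

    dest_dir is a hypothetical location, pick whatever suits your dom0.
    """
    text = Path(src).read_text()
    out_dir = Path(dest_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out = out_dir / time.strftime("slabinfo-%Y%m%d-%H%M%S.txt")
    out.write_text(text)
    return out


def largest_growth(before, after, top=5):
    """List (cache_name, bytes_grown) for the caches that grew the most.

    Growth is estimated as the change in active_objs times objsize.
    """
    deltas = []
    for name, (active, _num, size) in after.items():
        prev_active = before.get(name, (0, 0, size))[0]
        deltas.append((name, (active - prev_active) * size))
    deltas.sort(key=lambda item: item[1], reverse=True)
    return deltas[:top]
```

Calling snapshot() from a small script run by cron as root (for example
once a minute) leaves a trail of files; after a hang, feeding the last
two through parse_slabinfo() and largest_growth() gives a quick look at
which caches moved right before the machine went away.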
--
-- Tom Mornini, CTO
-- Engine Yard, Ruby on Rails Hosting
-- Reliability, Ease of Use, Scalability
-- (866) 518-YARD (9273)
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users