Hello,
I experienced, and sometimes still experience, the same thing. But I think I
found a reason... or at least one cause among possibly several.
If a domU uses swap and swaps out too much, and there is heavy I/O on the
backend (tap, file or iSCSI in my configs), the xenstored involved consumes
RAM for writing to the backend, which then starts lagging badly.
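A quick way to watch this happening (just a sketch; the 5-second interval is
arbitrary) is to keep an eye on dom0's free memory and the per-domain
allocations while the suspect domU is swapping:

    # run in dom0: refresh free memory and the domain list every 5 seconds
    watch -n 5 'free -m; xm list'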
So, for the buggy domains I would first suggest (see the commands sketched
after this list):
a) check the available RAM and check for swapping
b) disable swap for testing purposes
c) I think it is worst with tap:aio and plain loop-mounted disk images; try
switching the backend if you use one of those.
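A minimal sketch of a) and b), run inside the affected domU (the intervals
are arbitrary):

    # check memory and swap usage inside the domU
    free -m
    # watch the si/so columns for swap-in/swap-out activity
    vmstat 5 5
    # temporarily disable swap for testing (swapon -a re-enables it)
    swapoff -a

For c), switching the backend is just a change to the disk line in the domU
config; the image path and volume names here are only examples:

    # before: file-backed image via tap:aio
    # disk = [ 'tap:aio:/var/lib/xen/images/domu.img,xvda,w' ]
    # after: LVM volume via phy:
    # disk = [ 'phy:/dev/vg0/domu-disk,xvda,w' ]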
Meanwhile I have started over with no swap in my domUs at all, and if more
RAM is needed (the OOM killer acts inside the VM) I balloon it in, or look for
the reason. For me this is a viable approach and a good workaround.
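Ballooning in more memory from dom0 looks like this (the domain name and
sizes are only examples; the domU's configured maxmem must be high enough):

    # give the domU 'web01' 1024 MB, up to its configured maxmem limit
    xm mem-set web01 1024
    # verify the new allocation
    xm list web01

and in the domU config file the headroom is set with something like:

    memory = 512
    maxmem = 1024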
My dom0s are not up to date; perhaps the behaviour is "fixed" or different in
newer versions than mine (at least 4.0 shows this from time to time).
But, that said, it may be something totally different on your side.
Cheers,
Holger
On Thursday, 10 November 2011 13:56:03, Adrien Urban wrote:
> Hello,
>
> I work in a hosting company. We have tens of Xen dom0s running just fine,
> but unfortunately we do have a few that get out of control.
>
> Reported behaviour:
> - dom0 uses more and more memory
> - no process can be found using that memory
> - at some point, the OOM killer kicks in and kills everything, until even
> sshing into the box becomes hard
> - when there is really no process left to kill, it breaks down even further,
> and we are forced to reboot
>
> Configuration summary:
> - dom0 with Debian stable, Xen 4.0.1
> - 512MB of RAM, or up to 2GB after some crashes
>
>
> I have tried to find something that differs between a working dom0 and a
> buggy one, but didn't manage to find anything. They are installed from the
> same template, with the same packages, on the same hardware (apart from
> serials and MAC addresses).
>
>
> I didn't manage to find any clear reports of a leak in dom0 ending up with
> the OOM killer.
>
> I tried to gather as many logs as I thought could be helpful in the
> attachments.
> Host bk - about to be rebooted, as xend has already been killed
> Host sw - 800MB/2GB used for nothing
>
> The attachments contain:
>
> - memory graph (by munin) - it might help to see the pattern of memory
> usage
>
> cat of:
> - grub.cfg
> - /proc/meminfo
> - /proc/slabinfo
> - /proc/vmstat
> - /var/log/kern.log
> - /var/log/xen/xend.log
>
> Results from:
> - dmesg
> - dpkg -l
> - free
> - lsmod
> - top
> - vmstat
> - xm info
> - xm info -c
>
>
> I'd appreciate any feedback about this kind of behaviour, and would be happy
> to provide additional information.
> These are production servers; the only thing I'd really like to avoid as
> much as possible is rebooting them for tests.
>
>
> Regards,
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users