WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-users

Re: [Xen-users] Why does my DomU keep going mad?

To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-users] Why does my DomU keep going mad?
From: Lyle <webmaster@xxxxxxxxxxxxxx>
Date: Mon, 26 Jul 2010 14:48:08 +0100
Delivery-date: Mon, 26 Jul 2010 06:49:20 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4C4D8F7A.20007@xxxxxxxx>
List-help: <mailto:xen-users-request@lists.xensource.com?subject=help>
List-id: Xen user discussion <xen-users.lists.xensource.com>
List-post: <mailto:xen-users@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=unsubscribe>
References: <4C4D8D8A.80702@xxxxxxxxxxxxxx> <4C4D8F7A.20007@xxxxxxxx>
Sender: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-GB; rv:1.9.1.11) Gecko/20100711 Thunderbird/3.0.6
On 26/07/2010 14:36, Steve Spencer wrote:
Lyle wrote:
Hi All,
   I've got a DomU that sometimes goes mad. I can't ssh or usually even
console to it. The time I did manage to console I got a load of dumps
about being out of memory and swap, but couldn't run any commands to
find out which process had gone mad :(
   From Dom0 I can see the DomU at 100% CPU and can only stop it with a
destroy. What can I do/check to find out why this happens? Sometimes
it'll be fine for weeks on end, others it'll go wrong almost every day.
The server's average load is very low, around 0.1. I assume there is a
process that goes wild for whatever reason, but no idea where to start
to track it down :(
   I'm running the latest CentOS, any help much appreciated.


Lyle


_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users


Lyle,

What services does this DomU run?  In other words is it a mail server,
web server, radius, etc?  What can you tell us about the DomU that would
be of help to us helping you?

Here is an abridged ps aux; I cut out what looked like duplicates. Is there a way to set up some kind of process logging that triggers once CPU usage goes over 90%?

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1  10348   600 ?        Ss   06:14   0:00 init [3]
root         2  0.0  0.0      0     0 ?        S<   06:14   0:00 [migration/0]
root         3  0.0  0.0      0     0 ?        SN   06:14   0:00 [ksoftirqd/0]
root         4  0.0  0.0      0     0 ?        S<   06:14   0:00 [watchdog/0]
root         5  0.0  0.0      0     0 ?        S<   06:14   0:00 [events/0]
root         6  0.0  0.0      0     0 ?        S<   06:14   0:00 [khelper]
root         7  0.0  0.0      0     0 ?        S<   06:14   0:00 [kthread]
root         9  0.0  0.0      0     0 ?        S<   06:14   0:00 [xenwatch]
root        10  0.0  0.0      0     0 ?        S<   06:14   0:00 [xenbus]
root        14  0.0  0.0      0     0 ?        S<   06:14   0:00 [migration/1]
root        15  0.0  0.0      0     0 ?        SN   06:14   0:00 [ksoftirqd/1]
root        16  0.0  0.0      0     0 ?        S<   06:14   0:00 [watchdog/1]
root        17  0.0  0.0      0     0 ?        S<   06:14   0:00 [events/1]
root        20  0.0  0.0      0     0 ?        S<   06:14   0:00 [kblockd/0]
root        21  0.0  0.0      0     0 ?        S<   06:14   0:00 [kblockd/1]
root        22  0.0  0.0      0     0 ?        S<   06:14   0:00 [cqueue/0]
root        23  0.0  0.0      0     0 ?        S<   06:14   0:00 [cqueue/1]
root        27  0.0  0.0      0     0 ?        S<   06:14   0:00 [khubd]
root        29  0.0  0.0      0     0 ?        S<   06:14   0:00 [kseriod]
root        94  0.0  0.0      0     0 ?        S    06:14   0:00 [khungtaskd]
root        95  0.0  0.0      0     0 ?        S    06:14   0:00 [pdflush]
root        96  0.0  0.0      0     0 ?        S    06:14   0:00 [pdflush]
root        97  0.0  0.0      0     0 ?        S<   06:14   0:01 [kswapd0]
root        98  0.0  0.0      0     0 ?        S<   06:14   0:00 [aio/0]
root        99  0.0  0.0      0     0 ?        S<   06:14   0:00 [aio/1]
root       229  0.0  0.0      0     0 ?        S<   06:14   0:00 [kpsmoused]
root       254  0.0  0.0      0     0 ?        S<   06:14   0:00 [kstriped]
root       267  0.0  0.0      0     0 ?        S<   06:14   0:00 [ksnapd]
root       282  0.0  0.0      0     0 ?        S<   06:14   0:00 [kjournald]
root       304  0.0  0.0      0     0 ?        S<   06:14   0:00 [kauditd]
root       332  0.0  0.0  12604   348 ?        S<s  06:14   0:00 /sbin/udevd -d
root       664  0.0  0.0      0     0 ?        S<   06:14   0:00 [kmpathd/0]
root       665  0.0  0.0      0     0 ?        S<   06:14   0:00 [kmpathd/1]
root       666  0.0  0.0      0     0 ?        S<   06:14   0:00 [kmpath_handle]
root       688  0.0  0.0      0     0 ?        S<   06:14   0:00 [kjournald]
root      1067  0.0  0.1  27348   696 ?        S<sl 06:15   0:00 auditd
root      1069  0.0  0.1  81800   760 ?        S<sl 06:15   0:00 /sbin/audispd
root      1089  0.0  0.1   5908   532 ?        Ss   06:15   0:00 syslogd -m 0
root      1092  0.0  0.0   3804   324 ?        Ss   06:15   0:00 klogd -x
root      1101  0.0  0.0  10760   316 ?        Ss   06:15   0:00 irqbalance
named     1138  0.0  1.2 166536  6728 ?        Ssl  06:15   0:01 /usr/sbin/named
rpc       1171  0.0  0.0   8052   408 ?        Ss   06:15   0:00 portmap
root      1215  0.0  0.0      0     0 ?        S<   06:15   0:00 [rpciod/0]
root      1216  0.0  0.0      0     0 ?        S<   06:15   0:00 [rpciod/1]
rpcuser   1223  0.0  0.1  10160   564 ?        Ss   06:15   0:00 rpc.statd
root      1245  0.0  0.0  55180   236 ?        Ss   06:15   0:00 rpc.idmapd
dbus      1258  0.0  0.1  21356   852 ?        Ss   06:15   0:00 dbus-daemon --s
root      1266  0.0  0.0  10432   376 ?        Ss   06:15   0:00 /usr/sbin/hcid
root      1272  0.0  0.0   5936   392 ?        Ss   06:15   0:00 /usr/sbin/sdpd
root      1294  0.0  0.0      0     0 ?        S<   06:15   0:00 [krfcommd]
root      1329  0.0  0.0  21040   524 ?        Ssl  06:15   0:00 pcscd
root      1347  0.0  0.0   8516   364 ?        Ss   06:15   0:00 /usr/bin/hidd -
root      1380  0.0  0.1  54396   836 ?        Ssl  06:15   0:00 automount
root      1399  0.0  0.1  63516   532 ?        Ss   06:15   0:00 /usr/sbin/sshd
root      1407  0.0  0.1 134096   952 ?        Ss   06:15   0:00 cupsd
root      1419  0.0  0.1  21644   540 ?        Ss   06:15   0:00 xinetd -stayali
root      1430  0.0  0.0  44268   188 ?        Ss   06:15   0:00 /usr/sbin/vsftp
root      1462  0.0  0.1  65980   996 ?        S    06:15   0:00 /bin/sh /usr/bi
mysql     1509  0.0  0.8 191260  4308 ?        Sl   06:15   0:00 /usr/libexec/my
postgres  1589  0.0  0.2 120740  1344 ?        S    06:15   0:00 /usr/bin/postma
root      1600  0.0  0.0   6060   500 ?        Ss   06:15   0:00 /usr/sbin/dovec
root      1608  0.0  0.2  62500  1300 ?        S    06:15   0:00 dovecot-auth
dovecot   1612  0.0  0.2  33892  1300 ?        S    06:15   0:00 imap-login
postgres  1615  0.0  0.0 109920   176 ?        S    06:15   0:00 postgres: logge
nobody    1622  0.0 31.0 212288 163000 ?       Ssl  06:15   0:06 clamd.virtualmi
postgrey  1632  0.0  1.0 111480  5380 ?        Ss   06:15   0:00 /usr/sbin/postg
root      1684  0.0  0.3  54144  1828 ?        Ss   06:15   0:00 /usr/libexec/po
postfix   1691  0.0  0.3  55160  1932 ?        S    06:15   0:00 qmgr -l -t fifo
root      1701  0.0  0.0   6452   256 ?        Ss   06:15   0:00 gpm -m /dev/inp
postfix   1733  0.0  0.3  54204  1868 ?        S    06:15   0:00 tlsmgr -l -t un
root      1743  0.0  0.6 319152  3244 ?        Ss   06:15   0:00 /usr/sbin/httpd
apache    1746  0.0  0.0 249564   444 ?        S    06:15   0:00 /usr/sbin/httpd
root      1752  0.0  0.1  74860   724 ?        Ss   06:15   0:00 crond
root      1763  0.0  0.0  49764   420 ?        Ss   06:15   0:00 squid -D
squid     1765  0.0  0.5  52236  3128 ?        S    06:15   0:00 (squid) -D
squid     1767  0.0  0.0   3644   184 ?        Ss   06:15   0:00 (unlinkd)
apache    1779  0.0  0.0 319064   424 ?        S    06:15   0:00 /usr/sbin/fcgi-
sympa     1780  0.0  4.3 258180 22672 ?        S    06:15   0:01 /usr/bin/perl -
xfs       1796  0.0  0.1  20260   568 ?        Ss   06:15   0:00 xfs -droppriv -
root      1811  0.0  0.0  18732   352 ?        Ss   06:15   0:00 /usr/sbin/atd
root      1819  0.0  0.0  46740   304 ?        Ss   06:15   0:00 /usr/sbin/sasla
sympa     1833  0.0  4.1 230640 21652 ?        S    06:15   0:01 /usr/bin/perl -
avahi     1840  0.0  0.1  24172  1032 ?        Ss   06:15   0:00 avahi-daemon: r
68        1849  0.0  0.1  30428   976 ?        Ss   06:15   0:00 hald
root      1850  0.0  0.1  21692   532 ?        S    06:15   0:00 hald-runner
mailman   1866  0.0  0.1 149556   692 ?        Ss   06:15   0:00 /usr/bin/python
root      1898  0.0  0.7 257084  3736 ?        SN   06:15   0:00 /usr/bin/python
root      1900  0.0  0.1  12916   852 ?        SN   06:15   0:00 /usr/libexec/ga
root      1927  0.0  0.0      0     0 ?        Z    06:16   0:00 [sen] <defunct>
root      2046  0.0  0.3 125808  1740 ?        Ss   06:16   0:00 /usr/libexec/we
root      2056  0.0  0.0  18416   240 ?        S    06:16   0:00 /usr/sbin/smart
root      2069  0.0  0.1  52108   892 ?        Ss   06:16   0:00 login -- root
postfix  13030  0.0  0.5  54348  2636 ?        S    08:12   0:00 trivial-rewrite
postfix  17743  0.0  0.4  54208  2240 ?        S    09:14   0:00 pickup -l -t fi
postfix  18474  0.0  0.4  54204  2288 ?        S    09:24   0:00 anvil -l -t uni
postfix  19504  0.0  0.5  54428  2660 ?        S    09:35   0:00 local -t unix
dovecot  19540  0.0  0.3  33884  1632 ?        S    09:35   0:00 pop3-login
postfix  19543  0.0  0.8  72672  4492 ?        S    09:35   0:00 smtpd -n smtp -
postfix  19660  0.0  0.5  54468  2684 ?        S    09:40   0:00 cleanup -z -t u
postfix  19947  0.0  0.8  72672  4484 ?        S    09:41   0:00 smtpd -n smtp -
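
On the question of logging that triggers once CPU goes over 90%: one possible approach is a small watcher inside the DomU that samples /proc/stat and, when the busy share passes the threshold, appends the heaviest rows of ps aux to a log file. The following is only a rough sketch in Python 3, not something taken from this thread. The names THRESHOLD, INTERVAL, LOG_PATH, parse_cpu_line, busy_percent, top_by_column and watch are all hypothetical, and the 10 second interval and the log path are assumptions. Because the console dumps mentioned running out of memory and swap, the sketch records the top rows by %MEM as well as by %CPU.

```python
import subprocess
import time

THRESHOLD = 90.0                     # percent busy that triggers a snapshot
INTERVAL = 10                        # seconds between samples (assumed value)
LOG_PATH = "/var/log/cpu-watch.log"  # hypothetical log location


def parse_cpu_line(line):
    """Return (busy, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    # Fields user..steal; guest time is already counted inside user.
    values = [int(v) for v in line.split()[1:9]]
    idle = sum(values[3:5])          # idle + iowait
    total = sum(values)
    return total - idle, total


def busy_percent(prev, cur):
    """Percent of CPU time spent non-idle between two (busy, total) samples."""
    d_total = cur[1] - prev[1]
    if d_total <= 0:
        return 0.0
    return 100.0 * (cur[0] - prev[0]) / d_total


def top_by_column(ps_text, column, n=5):
    """Return the n 'ps aux' rows with the largest value in a numeric column."""
    lines = ps_text.strip().splitlines()
    idx = lines[0].split().index(column)
    rows = [row for row in lines[1:] if len(row.split()) > idx]
    return sorted(rows, key=lambda row: float(row.split()[idx]), reverse=True)[:n]


def sample():
    """Read the current aggregate CPU counters."""
    with open("/proc/stat") as f:
        return parse_cpu_line(f.readline())


def watch():
    """Loop forever, logging a process snapshot whenever CPU use is high."""
    prev = sample()
    while True:
        time.sleep(INTERVAL)
        cur = sample()
        pct = busy_percent(prev, cur)
        prev = cur
        if pct >= THRESHOLD:
            ps_text = subprocess.run(
                ["ps", "aux"], capture_output=True, text=True
            ).stdout
            with open(LOG_PATH, "a") as log:
                log.write("%s cpu=%.1f%%\n" % (time.ctime(), pct))
                for col in ("%CPU", "%MEM"):
                    log.write("top by %s:\n" % col)
                    log.write("\n".join(top_by_column(ps_text, col)) + "\n")
```

Calling watch() from a script started at boot would leave a trail to read after the next destroy. A watcher like this may itself be starved once the guest is out of memory and swap, so a lower threshold or a shorter interval could be worth trying to catch the run-up rather than the hang itself.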

