Keir -
I have put printk's with timestamps at the beginning of domain_kill() and right before the 'return -EAGAIN' call, to make sure that preemption is actually taking place. I can see, by viewing xm dmesg, that this code gets used often during the shutdown, more and more as the memory of the vm is increased. This is true on small commodity systems as well as the ES7000.
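For reference, a minimal sketch of the kind of instrumentation described above, not the actual patch: it assumes domain_kill() in xen/common/domain.c is roughly this shape, with printk() and NOW() (Xen system time in nanoseconds) as the real Xen primitives and everything else simplified.

/*
 * Sketch only: assumes domain_kill() is roughly this shape; the
 * real function at this changeset is structured differently.
 */
int domain_kill(struct domain *d)
{
    int rc;

    /* Timestamp on entry. */
    printk("domain_kill: dom%d enter at %ld ns\n",
           d->domain_id, (long)NOW());

    rc = domain_relinquish_resources(d);
    if ( rc == -EAGAIN )
    {
        /* Preemption point: the toolstack retries the hypercall. */
        printk("domain_kill: dom%d preempting at %ld ns\n",
               d->domain_id, (long)NOW());
        return -EAGAIN;
    }

    /* ... remainder of the teardown path ... */
    return rc;
}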
However, I have noticed something else. The original unresponsiveness was observed by having a shutdown in one terminal window and xm top with a delay of 1 second in another. As soon as xm top stopped updating the clock, the overall system responsiveness stopped with it.
- If I do not use xm top while the vm is shutting down, there are few or no problems with system responsiveness.
- If I run xm list (or xm info or xm dmesg) during a shutdown, then nothing gets listed, but I have control over all the terminals and I can ctrl-C my way out of the xm command.
- If I do a shutdown and a non-xm command such as ping, I can ping another domain and I have no responsiveness problems at all, with zero packets lost.
It appears that the problem lies in the combination of xm commands and shutdown; running xm top alongside a shutdown shows this behavior the most.
At this point I am trying to find out exactly when things get stuck for xm top and shutdown.
Luke Szymanski
Unisys Corp.
From: Keir Fraser [mailto:Keir.Fraser@xxxxxxxxxxxx]
Sent: Wednesday, September 12, 2007 3:35 AM
To: Krysan, Susan; yamahata@xxxxxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-devel] [PATCH 0/3] continuable destroy domain: Make XEN_DOMCTL_destroydomain hypercall continuable
That’s quite surprising. You should add tracing to domain_kill() and find out how many times it gets preempted, how long it runs for between preemptions, etc., and hence find out whether the preemption logic is working at all for you.
-- Keir
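A minimal sketch of the kind of tracing suggested here, under the same assumptions as the snippet above; the per-domain counter 'kill_preemptions' is a hypothetical field added to struct domain purely for illustration.

/*
 * Sketch only: counts how often domain_kill() is preempted and how
 * long each run lasts before hitting the -EAGAIN preemption point.
 */
int domain_kill(struct domain *d)
{
    s_time_t start = NOW();
    int rc;

    rc = domain_relinquish_resources(d);
    if ( rc == -EAGAIN )
    {
        d->kill_preemptions++;   /* hypothetical counter field */
        printk("dom%d: preemption %u after %ld ns of work\n",
               d->domain_id, d->kill_preemptions,
               (long)(NOW() - start));
        return rc;
    }

    printk("dom%d: destroy finished after %u preemptions\n",
           d->domain_id, d->kill_preemptions);
    /* ... remainder of the teardown path ... */
    return rc;
}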
On 11/9/07 22:57, "Krysan, Susan" <KRYSANS@xxxxxxxxxx> wrote:
I tested the patches on a Unisys ES7000 x86_64 host using 64-bit SLES10 paravirtualized domains. I compared unstable c/s 15730 and 15826 and, unfortunately, the amount of time the host is unresponsive when shutting down large vms has not changed. For example, on a 32x 128GB host, the shutdown of a 62GB vm causes 1 minute and 40 seconds of host unresponsiveness both before and after the patches are applied. Also, I noticed that not only does the amount of unresponsive host time increase when the memory size of the domains increases, but it also increases when the size of the host memory increases. Here are my test results: