xen-devel
Re: [Xen-devel] Re: [Xen-users] old issue after 1024 live migrations se
Hi Ian & list,
Sure, I'll provide the specifics of my config.
2010/7/22 Ian Campbell <Ian.Campbell@xxxxxxxxxx>:
> (dropping xen-users to avoid cross-posting)
> Do you have a reference to this old issue?
I googled for the old mailing list post, but had no luck given the
amount of traffic on the Xen lists.
First of all, I'm glad if it's a different bug and doesn't exist for
most people :)
> To be honest I think it is unlikely that you are seeing the actual same
> issue as a bug that old, even if your symptoms are very similar.
>
> Can you give details of your precise system configuration for both host
> and guest, hypervisor changeset (I don't know what Oracle VM 2.0 has in
> it), kernel changeset for both dom0 and domU etc.
dom0 (both identical)
xen_major : 3
xen_minor : 4
xen_extra : .0
xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p
xen_scheduler : credit
xen_pagesize : 4096
platform_params : virt_start=0xff400000
xen_changeset : unavailable
[root@waxh0004 ~]# uname -a
Linux waxh0004 2.6.18-128.2.1.4.9.el5xen #1 SMP Fri Oct 9 14:57:31 EDT 2009 i686 i686 i386 GNU/Linux
domU:
debian:~# uname -a
Linux debian 2.6.26-2-xen-686 #1 SMP Wed Nov 4 23:23:33 UTC 2009 i686 GNU/Linux
(debian lenny from stacklet.com, kernel date was nov9 09)
> I am currently doing some live migration testing with guests under load
> (forkbomb) and am regularly doing 4-5000 successful migrations before I
> hit a very subtle deadlock in a PVops domU kernel. I have most likely in
> the past 4-5 years personally done tens of thousands of iterations of
> live migration in various scenarios and we know other people are
> regularly doing automated and manual test of these things so the problem
> you are seeing is almost certainly not a generic failure but must be
> specific to the version of one or more components in your system.
good!
> Are you seeing failure after precisely 1024 migrations in every case or
> is that just a rough figure? It might be worth
No, it was more like "just above 1000"; I also had a counter
problem in the script.
Note that before that, a few migrations ended with the domU being
down, so the leak you hint at below might just be the thing.
> using /usr/lib/xen/bin/lsevtchn to check what is happening to both the
> dom0 and domU event channels after each migration iteration. Once upon a
okay, will log that
> time I was seeing an evtchn leak in domU (now fixed) but that wouldn't
> fail after precisely 1024 iterations since there is always a number of
> non-leaking event channels also in use.
>
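A minimal sketch of how the per-iteration logging suggested above could be
done. It assumes that lsevtchn accepts a domain ID as its only argument and
prints one line per event channel; the output is treated as opaque lines, so
no particular format is relied upon. The function names are hypothetical and
not part of any Xen tool.

```python
import subprocess

# Path as quoted in the thread; adjust if your install differs.
LSEVTCHN = "/usr/lib/xen/bin/lsevtchn"


def count_channels(lsevtchn_output):
    """Count non-empty lines (assumption: one event channel per line)."""
    return sum(1 for line in lsevtchn_output.splitlines() if line.strip())


def read_channel_count(domid):
    """Run lsevtchn for one domain ID (assumed CLI usage) and count lines."""
    result = subprocess.run(
        [LSEVTCHN, str(domid)], capture_output=True, text=True, check=True
    )
    return count_channels(result.stdout)


def growth_since_start(counts):
    """Difference between the last and first sample (0 with < 2 samples)."""
    if len(counts) < 2:
        return 0
    return counts[-1] - counts[0]
```

Calling read_channel_count() for dom0 and for the domU after every migration
and appending the results to a list gives a series to pass to
growth_since_start(); a value that keeps climbing would point to a leak,
while a flat series would rule one out.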
> Are you able to test with an up to date xen-3.4-testing or even better
> the xen-4.0-testing tree?
Retesting with Xen 4 would be a bit tricky. Oracle has an SDK domU
that has all the dom0 sources, but it would still take a day of work,
I'm afraid.
I'd hope some other people can do the testing on other versions; that's
what I asked for, and why I didn't send this to xen-devel in the first
place.
I fixed LAN management access to one of the hosts (for serial
console/reboot/reset...), so on that one I could try retesting with
xen-3.4-testing.
If the issue doesn't show up in your tests, then I agree it's probably
specific to this version. In that case I can just inform Oracle
and they can look into it on their own.
>> > is it just the gratuitous ARP?
>
> The Grat. ARP doesn't get sent by current PVops kernels (I don't know if
> you are using this since you haven't provided any details about your
> system configuration). A fix is pending in the network subsystem
I know I didn't, because I just asked for someone else to run the
script and retest ;p
> maintainers tree which I hope will be backported to 2.6.32.x when it
> goes into mainline during the next merge window.
> See 06c4648d46d1b757d6b9591a86810be79818b60c and
> 592970675c9522bde588b945388c7995c8b51328 in net-next-2.6.git. You will
> also need to configure sysctl to enable the arp_notify option for the
> devices setting net.ipv4.conf.all.arp_notify = 1 is likely sufficient.
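As a rough illustration of the sysctl setting mentioned above, the sketch
below checks whether arp_notify is currently enabled by reading the
corresponding /proc/sys entry. The helper name is hypothetical, and the
entry only exists on kernels that support arp_notify; on older kernels the
function returns None.

```python
from pathlib import Path

# /proc/sys counterpart of the sysctl key net.ipv4.conf.all.arp_notify
ARP_NOTIFY_PATH = "/proc/sys/net/ipv4/conf/all/arp_notify"


def arp_notify_enabled(path=ARP_NOTIFY_PATH):
    """Return True if the entry holds 1, False otherwise, None if missing."""
    try:
        value = Path(path).read_text().strip()
    except FileNotFoundError:
        # Kernel has no arp_notify entry (or the path is wrong).
        return None
    return value == "1"
```

To make the setting persistent, the line quoted above
(net.ipv4.conf.all.arp_notify = 1) would normally go into /etc/sysctl.conf.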
classic domU kernel
I'll see whether I can get a newer dom0 kernel to work, but I'll be on
vacation for a week now.
Considering that you successfully migrate a few thousand times I'd
suggest you forget about the issue until then.
Greetings,
Flo
--
'Sie brauchen sich um Ihre Zukunft keine Gedanken zu machen'
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel