Last night's run lasted over 15 hours before the same
"blocked for more than 480 seconds" message appeared. This
time the tmem patch was running so I/O was greatly
reduced, which might account for the change in behavior
(or it might be completely random).
Interestingly, the domain isn't completely frozen.
It is still doing some things but is mostly non-responsive.
I was able to do a ctrl-Z on the console and get the
normal shell response, but then no prompt. I am also
able to see stuff by sending it sysrq's using xm.
I'll give cpuidle=off a spin this weekend but...
> Hmm could be the kernel I suppose.
Yes, this article would lead me to believe so:
I'll also try to reproduce on 2.6.18. If I can't, I'd
chalk it up as a kernel problem.
> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@xxxxxxxxx]
> Sent: Friday, April 17, 2009 2:13 AM
> To: Keir Fraser; Dan Magenheimer; xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: RE: [Xen-devel] Second release candidate for Xen 3.4.0
> >From: Keir Fraser
> >Sent: April 17, 2009 16:06
> >On 17/04/2009 08:55, "Keir Fraser" <keir.fraser@xxxxxxxxxxxxx> wrote:
> >> On 16/04/2009 18:09, "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx> wrote:
> >>> FYI, I can still reproduce the "blocked for more than 480 seconds"
> >>> problem I reported yesterday. After running >2 hours of load,
> >>> the 2.6.29 guest spews out a number of Call Trace's and freezes.
> >>> Each is prefixed with:
> >> Hmm could be the kernel I suppose. Or perhaps there's a time
> >> issue lurking.
> >And if the latter, the cpuidle stuff would still be the most
> >likely culprit in my opinion. Did you repro problems with
> >cpuidle=off?
> I think Dan mentioned 'cpuidle=off' in his previous post, but
> of course this option is worth confirming further:
> > > -----Original Message-----
> > > From: Dan Magenheimer
> > > Sent: Wednesday, April 15, 2009 8:59 AM
> > > To: Dan Magenheimer; Keir Fraser; Xen-Devel (E-mail); Tian, Kevin
> > > Subject: RE: [Xen-devel] Time goes backwards in dom0 in
> > >
> > >
> > > Hmmm... after only a few minutes with cpuidle=off,
> > > my test domPV froze up after printing a number of
> > > call traces starting with:
> > >
> > > INFO: task xxx:nnn blocked for more than 480 seconds.
> > >
> > > At the top of all of the traces is either
> > > getnstimeofday+51 or io_schedule+44.
> > >
> > > (Note that this PV domain is a 2.6.29 kernel... don't
> > > know if the messages are the same on an older kernel.)