Hi Dan,
We still can't reproduce this failure locally, even with a Merom laptop. Do
you have the dock for your Dell D630? I think the dock should have a serial
port, so maybe you can capture the failure log through it. If we can get the
failure log, it should help identify this issue. Also, I have analyzed
Cset #20072 and Cset #20073 and found no clue as to what could lead to this
issue. In addition, I talked with Ke; he said he could reproduce another
issue related to hwclock, but he can't reproduce this issue on any platform
either. :(
Thanks!
Xiantao
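
For anyone retrying the bisect discussed further down in this thread, the
fix dependencies quoted below can be written down as a small table. This is
only a sketch in Python: FIXES and patches_needed are hypothetical names made
up for illustration, the table holds just the changeset pairs named in this
thread, and it assumes changeset numbers are linear, so a fix is needed only
when the tested changeset already contains the change that needs fixing but
not yet the fix itself.

```python
# Changeset -> follow-up fixes, taken only from the pairs named in this thread.
# Hypothetical helper for illustration; not part of the Xen tree or its tools.
FIXES = {
    20073: [20093, 20149],  # two minor fixes for c/s 20073
    20076: [20092],         # typo fix
    20084: [20140],         # serious issue fixed later
}


def patches_needed(candidate):
    """Return the fix changesets to apply on top of `candidate` before testing.

    Assumption: changeset numbers are linear, so a fix is needed when the
    candidate is at or after the changeset being fixed and before the fix.
    """
    needed = []
    for introduced, fixes in sorted(FIXES.items()):
        if candidate >= introduced:
            needed.extend(f for f in fixes if f > candidate)
    return sorted(needed)


if __name__ == "__main__":
    # The hypervisor changesets tried in the bisect quoted below.
    for cset in (20070, 20072, 20073, 20082, 20143):
        print(cset, patches_needed(cset))
```

Under these assumptions, a test of 20073 would carry 20093 and 20149 on top,
which matches the 20073+20093+20149 combination already tried in this thread.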
-----Original Message-----
From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
Sent: Wednesday, December 09, 2009 11:50 PM
To: Zhang, Xiantao; Yu, Ke; Xen-Devel (E-mail)
Subject: RE: latest xen-unstable fails to boot on Dell D630 (likely hpet/Cstate
problem)
> Could you attach the failure log ?
I can't get any failure logs because dom0 fails to boot.
The failure conditions are the same as described
here:
http://lists.xensource.com/archives/html/xen-devel/2009-10/msg01027.html
However, I have attached the xm dmesg output from
a successful boot (with max_cstate=2).
> In addition, does this system have ioapic support ?
I think so. See attached log.
> I think hpet doesn't use MSI, right ?
I don't think so.
Dan
> -----Original Message-----
> From: Zhang, Xiantao [mailto:xiantao.zhang@xxxxxxxxx]
> Sent: Tuesday, December 08, 2009 9:40 PM
> To: Dan Magenheimer; Yu, Ke; Xen-Devel (E-mail)
> Subject: RE: latest xen-unstable fails to boot on Dell D630 (likely
> hpet/Cstate problem)
>
>
> Dan Magenheimer wrote:
> > FYI, 20073+20093+20149 boots properly and xend starts
> > WITH max_cstate=2, but dom0 FAILs to boot unless
> > max_cstate=2 is added as a Xen boot parameter.
>
> Could you attach the failure log ? In addition, does this
> system have ioapic support ? I think hpet doesn't use MSI, right ?
> Xiantao
>
>
> > So I still think something changed at 20073 that
> > causes Merom+RHEL5dom0 to fail to boot due to not
> > recovering from deep C-state (after dom0 runs
> > /sbin/hwclock ... Ke Yu knows how to reproduce
> > the problem).
> >
> > Thanks,
> > Dan
> >
> >> -----Original Message-----
> >> From: Zhang, Xiantao [mailto:xiantao.zhang@xxxxxxxxx]
> >> Sent: Tuesday, December 08, 2009 6:44 PM
> >> To: Dan Magenheimer; Yu, Ke; Xen-Devel (E-mail)
> >> Subject: RE: latest xen-unstable fails to boot on Dell D630 (likely
> >> hpet/Cstate problem)
> >>
> >>
> >> Dan,
> >> Don't test Cset #20073 on its own, since it needs
> >> two minor fixes checked in as Cset #20093 and #20149.
> >> Besides this, Keir also has a typo in Cset #20076 fixed by
> >> Cset #20092. In addition, one serious issue was
> >> introduced in Cset #20084 and fixed in Cset #20140. I
> >> remember Pod also had issues that could crash the hypervisor
> >> before Cset #20100. Thus, it is very hard to identify this
> >> issue through bisect before Cset #20149, since these issues
> >> were introduced and fixed in an interleaved way. Certainly,
> >> if you want to test Cset #20073, you at least have to apply
> >> Cset #20093 and #20149 on top of it. :)
> >> Xiantao
> >>
> >>
> >> Dan Magenheimer wrote:
> >>>> But I'll give bisecting a try.
> >>>
> >>> Looks like the problem has been around for awhile. It appears
> >>> the problem starts at c/s 20073. Xiantao cc'ed since 20073 was
> >>> his patch.
> >>>
> >>> 20070 boots OK without max_cstate=2
> >>>
> >>> 20072 boots most of the way without max_cstate=2 but crashes
> >>> before a login prompt (when xend is starting I think)
> >>>
> >>> 20073 FAILS to boot without max_cstate=2 but crashes
> >>> before a login prompt
> >>>
> >>> 20082 FAILS to boot without max_cstate=2 but crashes
> >>> before a login prompt with max_cstate=2
> >>>
> >>> 20143 FAILS to boot without max_cstate=2 but boots OK with
> >>> max_cstate=2
> >>>
> >>> Note that I have NOT bisected tools, just the hypervisor
> >>> so the crashes are likely due to a newer xend failing on
> >>> an older hypervisor (which is irrelevant to this problem).
> >>>
> >>>> -----Original Message-----
> >>>> From: Dan Magenheimer
> >>>> Sent: Tuesday, December 08, 2009 10:42 AM
> >>>> To: Yu, Ke; Xen-Devel (E-mail)
> >>>> Subject: RE: latest xen-unstable fails to boot on Dell D630
> >>>> (likely hpet/Cstate problem)
> >>>>
> >>>>
> >>>>> case, if convenient, could you help do some bisecting to see
> >>>>> which cset causes this bug?
> >>>>
> >>>> I can do this, but because it is often no longer easy to
> >>>> bisect Xen because of interdependencies with other
> >>>> components, I was hoping that Keir or you or someone might
> >>>> have some idea of what changeset might have caused the
> >>>> regression.
> >>>> But I'll give bisecting a try.
> >>>>
> >>>>> max_cstate=2), when dom0 hangs, is Xen still alive? E.g. can
> >>>>> Xen respond to three Ctrl-'A' over serial?
> >>>>
> >>>> Unfortunately, I can't seem to get a Xen console working on
> >>>> the Merom machine, and the problem can't be reproduced on
> >>>> my other machine where the Xen console is working (because
> >>>> Conroe doesn't support deep C).
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Yu, Ke [mailto:ke.yu@xxxxxxxxx]
> >>>>> Sent: Tuesday, December 08, 2009 12:08 AM
> >>>>> To: Dan Magenheimer; Xen-Devel (E-mail)
> >>>>> Subject: RE: latest xen-unstable fails to boot on Dell D630
> >>>>> (likely hpet/Cstate problem)
> >>>>>
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> In this thread, I observed that I was unable to
> >>>>>> provoke deep C state (C3) on my Dell D630, which has
> >>>>>> an Intel Merom (dual-core laptop) processor. At that
> >>>>>> time, when I tried enabling hpetbroadcast, dom0 boot failed.
> >>>>>>
> >>>>>> http://lists.xensource.com/archives/html/xen-devel/2009-10/msg01027.html
> >>>>>>
> >>>>>> As it turned out, all RHEL5-based (maybe RHEL4- also) dom0
> >>>>>> default installations run /sbin/hwclock, which IIRC takes
> >>>>>> the RTC away from Xen and gives it to dom0. Since the
> >>>>>> Xen hpet emulation does not do RTC emulation, bad things
> >>>>>> then happen when a deep Cstate is entered (dom0 apparently
> >>>>>> never wakes up). I think Ke Yu has also reproduced
> >>>>>> this problem.
> >>>>>>
> >>>>>> Sometime in the last few weeks, some patch in xen-unstable
> >>>>>> apparently changed some defaults and xen-unstable will
> >>>>>> no longer boot with this processor/dom0, with or without
> >>>>>> hpetbroadcast on the Xen command line. However, specifying
> >>>>>> max_cstate=2 on the Xen command line allows a successful
> >>>>>> dom0 boot, so I suspect the problem is the same (or at
> >>>>>> least very similar).
> >>>>>>
> >>>>>> I did a quick scan for hpet changes and found c/s 20497,
> >>>>>> but backing it out made no difference.
> >>>>>>
> >>>>>> I have a workaround for now, but since it is likely that
> >>>>>> many customers (including all of Oracle's OVS customers)
> >>>>>> use a RHEL5-based dom0 boot sequence, and Merom processors
> >>>>>> work fine otherwise, it would be nice to get this identified
> >>>>>> and fixed before 4.0.
> >>>>>
> >>>>> Let's first figure out which component the issue resides in.
> >>>>>
> >>>>> First, in the default boot (i.e. without specifying
> >>>>> max_cstate=2), when dom0 hangs, is Xen still alive? E.g. can
> >>>>> Xen respond to three Ctrl-'A' over serial?
> >>>>>
> >>>>> If only dom0 hangs, it is probably an RTC malfunction that makes
> >>>>> dom0 time incorrect and causes dom0 to fail to boot. Then RTC
> >>>>> emulation in the hypervisor should fix this issue.
> >>>>>
> >>>>> If Xen also hangs, it should be another bug, i.e. hpet
> >>>>> broadcast does not wake up the CPU in deep C states. In this
> >>>>> case, if convenient, could you help do some bisecting to see
> >>>>> which cset causes this bug?
> >>>>>
> >>>>> Best Regards
> >>>>> Ke
>
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel