This is a long shot, but since my thoughts jumped to it after
reading this, I thought I'd post anyway.
Some systems support a special "C1E" power state
that can be enabled/disabled in the BIOS. My Dell Core2Duo
laptop has this feature. I remember running into
some weirdness that went away when I turned it off.
Perhaps the power management code is somehow entering
the BIOS to see if this is enabled, and max_cstate isn't
controlling it since the check is done in the BIOS.
Google for C1E to find lots of information about
this weird power state.
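Before chasing C1E, it may be worth confirming that max_cstate really
made it onto the boot command line. Below is a rough sketch, not
anything taken from Xen itself: parse_max_cstate is a made-up helper,
the example command line is hypothetical, and the "last occurrence
wins" rule is my assumption. It only looks at the command line string,
so it says nothing about whether the BIOS still enables C1E behind the
hypervisor's back.

```python
import re
from typing import Optional

# Matches "max_cstate=<n>" as a whole option, optionally prefixed the way
# the Linux kernel spells it ("processor.max_cstate=<n>").
_MAX_CSTATE_RE = re.compile(r"(?:^|\s)(?:processor\.)?max_cstate=(\d+)(?=\s|$)")


def parse_max_cstate(cmdline: str) -> Optional[int]:
    """Return the max_cstate value found in a boot command line, or None.

    If the option appears more than once, the last occurrence is returned
    (an assumption about precedence, not verified against Xen's parser).
    """
    matches = _MAX_CSTATE_RE.findall(cmdline)
    return int(matches[-1]) if matches else None


if __name__ == "__main__":
    # Hypothetical command line, for illustration only.
    example = "console=vga dom0_mem=1024M max_cstate=0"
    print(parse_max_cstate(example))
```

If this returns None, the option never reached the command line that
was checked, and the hangs would need another explanation.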
> -----Original Message-----
> From: Roger Cruz [mailto:roger.cruz@xxxxxxxxxxxxxxxxxxx]
> Sent: Monday, October 04, 2010 8:19 AM
> To: Jan Kiszka
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx; Konrad Rzeszutek Wilk
> Subject: [Xen-devel] RE: How to generate a HW NMI
> Until Friday, all hard hangs that we and our customers had experienced
> were on Lenovo T500 and X200, even with their latest BIOSes. The
> T400 has never hung for me and I don't have any reports on them from
> the field. On Friday, I had an HP i5 hard hang with a similar footprint to
> the Lenovos. When this hard hang happens, the Xen watchdog (which is
> driven by the NMI handler) will not do its job and cause a crash/stack
> trace. This is why we have started to suspect something with the BIOS
> and SMIs as they are the only thing that can block an NMI. I am pretty
> certain that this is somehow related to entering C3 power states and
> possibly at the same time an SMI comes in. The time it takes to hang
> varies from 30 minutes to 24 hours.
> -----Original Message-----
> From: Jan Kiszka [mailto:jan.kiszka@xxxxxxxxxxx]
> Sent: Monday, October 04, 2010 10:13 AM
> To: Roger Cruz
> Cc: Konrad Rzeszutek Wilk; xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: Re: How to generate a HW NMI
> On 04.10.2010 15:56, Roger Cruz wrote:
> > Jan,
> > I will try your suggestion of turning off SMIs. I am also interested
> > in you conducting an experiment for me. If you can, please tell your
> > kernel not to use any CPU power saving modes. In Xen I use
> > max_cstate=0 in the
> > I have found that when I do this, the hangs appear to go away (we
> > had one report since using this work-around, so it is not 100%
> > working).
> Will do. My customer reported that he was able to easily crash his i7
> notebook by pulling and re-plugging the power cable. I bet all of these
> events are trapped by the BIOS via power management SMIs...
> BTW, do you see any correlation between crashable boxes and BIOS
> vendors? We have no representative numbers yet, just one confirmed
> unstable notebook that is Phoenix-based, while one AMI-based i7 server
> is rock-stable.
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux
> Xen-devel mailing list