This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


[Xen-devel] Re: Question about Xen S3 and resume code - Linux dom0 never

To: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
Subject: [Xen-devel] Re: Question about Xen S3 and resume code - Linux dom0 never exits the xen_safe_halt hypercall after resume
From: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Date: Mon, 20 Jun 2011 08:36:26 -0400
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, "Yu, Ke" <ke.yu@xxxxxxxxx>
Delivery-date: Mon, 20 Jun 2011 05:39:59 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <625BA99ED14B2D499DC4E29D8138F1505D2C2DD530@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <20110616225739.GA8714@xxxxxxxxxxxx> <625BA99ED14B2D499DC4E29D8138F1505D2C2DD530@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.21 (2010-09-15)
> ideally ACPI S3/S5 has nothing to do with ACPI processor driver which is for 
> Cx/Px.

> > 
> > (which is in the devel/acpi-s3.v0 branch).
> > 
> > the hypervisor, after an S3 resume sits forever in the default_idle. The
> > Linux dom0 is stuck looping (I think) around SCHEDOP_block hypercall.
> > 
> > http://darnok.org/xen/devel.acpi-s3.v1.serial.log
> > 
> > If that patch above is present and I've cpufreq=xen on the Xen
> > hypervisor then Linux kernel gets unstuck and returns to userspace:
> > 
> > http://darnok.org/xen/devel.acpi-s3.v0.serial.log
> Compare your logs, the major difference is:
> [  168.754739] calling  i2c-8+ @ 3096
> [  168.758200] call i2c-8+ returned 0 after 0 usecs
> <<< 1st case stuck here
> [  168.762882] calling  card0-VGA-1+ @ 3096
> [  168.766867] call card0-VGA-1+ returned 0 after 0 usecs
> [  168.772085] calling  ttm+ @ 3096
> [  168.775360] call ttm+ returned 0 after 0 usecs
> [  168.779870] PM: resume of devices complete after 13117.603 msecs
> [  168.786006] PM: Finishing wakeup.
> <<<2nd case forward progress
> It looks that VGA card resume has some problem on resume, which then

In both cases, with the patch and without.

> makes dom0 stay in idle loop and thus block hypercall, and then due to
> no runnable vcpu so Xen most time in idle_loop too. In earlier log there're
> some stack trace in i915 driver. Perhaps you can try a different machine

Or remove the i915 just to eliminate that.
> or try native S3 on same box to make sure it's not mixed with native issues.
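
The log comparison quoted above can also be done mechanically. Below is a
minimal sketch, assuming both serial logs contain the "call <name>+ returned"
lines in the format shown in the excerpt. The function names and the two
sample strings are made up for illustration; the sample lines themselves are
copied from the excerpt. It reports the step that the good log completes
right after the last step the stuck log managed to finish.

```python
import re

# Matches lines such as
# "[  168.758200] call i2c-8+ returned 0 after 0 usecs".
# The trailing "+" on the device name is taken from the excerpt; other
# kernels may format these lines differently (assumption).
RETURNED = re.compile(r"\]\s+call\s+(\S+?)\+\s+returned")


def resume_steps(log_text):
    """Return the names of the calls that returned, in log order."""
    steps = []
    for line in log_text.splitlines():
        match = RETURNED.search(line)
        if match:
            steps.append(match.group(1))
    return steps


def next_step_after_stall(stuck_log, good_log):
    """Return the step the good log completes after the stuck log's last one.

    Returns None if the stall point cannot be located in the good log or
    if the good log has nothing after it.
    """
    stuck = resume_steps(stuck_log)
    good = resume_steps(good_log)
    if not stuck:
        return good[0] if good else None
    last_done = stuck[-1]
    # Search from the end so that earlier occurrences of the same name
    # (for example from the suspend phase) are not picked up.
    for i in range(len(good) - 1, -1, -1):
        if good[i] == last_done:
            return good[i + 1] if i + 1 < len(good) else None
    return None


# Sample data: lines copied from the excerpt quoted in this thread.
STUCK = """\
[  168.754739] calling  i2c-8+ @ 3096
[  168.758200] call i2c-8+ returned 0 after 0 usecs
"""

GOOD = STUCK + """\
[  168.762882] calling  card0-VGA-1+ @ 3096
[  168.766867] call card0-VGA-1+ returned 0 after 0 usecs
[  168.772085] calling  ttm+ @ 3096
[  168.775360] call ttm+ returned 0 after 0 usecs
"""

if __name__ == "__main__":
    print(next_step_after_stall(STUCK, GOOD))
```

This only points at where the two logs diverge; it says nothing about why
the step after the stall point did not complete.
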
> > 
> > (however, if I set cpuidle=0 cpufreq=none on the hypervisor line and
> > have the 9f301b0a0081676dfc71b7f0898295e6bcba391a patch it still
> > gets stuck).
> > 
> > I figured that the primary reason the guest is allowed to
> > exit its SCHEDOP_block loop is b/c the pm_idle call is set to the
> > acpi_processor_idle which does "something" extra after the machine comes
> > out of a S3 suspend.
> If that's the case I think you should disable CONFIG_ACPI_PROCESSOR in dom0
> before incorporating Xen specific version (the patch you tried). We don't want
> dom0 to play with Cx directly b/c it's the responsibility of Xen.

Huh? You misunderstood me. The 'acpi_processor_idle' is the hypervisor's
idle loop. The hypervisor can be running inside of that one, or inside the
'default_idle' loop. My question is why that specific hypervisor idle loop
would make dom0 run nicely while the default one would not.

In dom0, regardless of the patches, the 'default_idle' is run, which makes the
xen_safe_halt paravirt call.

> Of course we still need figure out why same issues occur with cpuidle=0/
> cpufreq=none, which however can be revisited after the basic S3 works. :-)

Right. The end result of those parameters is that the 'default_idle' in the
hypervisor is chosen instead of the 'acpi_processor_idle' one.
> > 
> > Any ideas?
> No other ideas for now. From historical view Xen S3 was supported before

Hmm, I am actually tempted to start commenting out code in the 
and seeing what will cause it to have the same failure as 'default_idle'.
