xen-devel


To: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Subject: RE: [Xen-devel] RE: Question about Xen S3 and resume code - Linux dom0 never exits the xen_safe_halt hypercall after resume
From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
Date: Thu, 30 Jun 2011 13:34:27 +0800
Accept-language: en-US
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, "Yu, Ke" <ke.yu@xxxxxxxxx>
Delivery-date: Wed, 29 Jun 2011 22:35:19 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <625BA99ED14B2D499DC4E29D8138F1505D34A529A4@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <20110616225739.GA8714@xxxxxxxxxxxx> <625BA99ED14B2D499DC4E29D8138F1505D2C2DD530@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <20110620123626.GA2973@xxxxxxxxxxxx> <625BA99ED14B2D499DC4E29D8138F1505D34A529A4@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcwvRrPeZu/fR35lSwiQr2MjXXEXmAAVicywAdJmWsA=
Thread-topic: [Xen-devel] RE: Question about Xen S3 and resume code - Linux dom0 never exits the xen_safe_halt hypercall after resume
Hi, Konrad,

Any update on the S3 problem you're seeing?

I just got a chance to try it on my Dell Core i7 platform running Ubuntu 10.10.

Xen version is:
changeset:   23632:33717472f37e
tag:         tip
user:        Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
date:        Tue Jun 28 18:15:44 2011 +0100
summary:     libxc: Squash xc_e820.h (and delete) into xenctrl.h

For dom0 I use origin/master plus the ACPI patches queued on your
origin/devel/acpi-s3.v0:
commit 4aa69dc48e031276b4d771dcb227d553fd3def0b
Merge: df5b2b6 9f90a3b
Author: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Date:   Tue Jun 21 09:34:31 2011 -0400

    Merge branch '3.0-rc1-rem_pg_reserve-4' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm

With or without the ACPI processor patches, ACPI S3 works fine on my box.

Thanks
Kevin

> From: Tian, Kevin
> Sent: Tuesday, June 21, 2011 7:22 AM
> 
> > From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@xxxxxxxxxx]
> > Sent: Monday, June 20, 2011 8:36 PM
> >
> > > Ideally ACPI S3/S5 has nothing to do with the ACPI processor driver, which is
> > > for Cx/Px.
> >
> > Right..
> > >
> > > >
> > > > (which is in the devel/acpi-s3.v0 branch).
> > > >
> > > > the hypervisor, after an S3 resume sits forever in the default_idle. The
> > > > Linux dom0 is stuck looping (I think) around SCHEDOP_block hypercall.
> > > >
> > > > http://darnok.org/xen/devel.acpi-s3.v1.serial.log
> > > >
> > > > If that patch above is present and I've cpufreq=xen on the Xen
> > > > hypervisor then Linux kernel gets unstuck and returns to userspace:
> > > >
> > > > http://darnok.org/xen/devel.acpi-s3.v0.serial.log
> > >
> > > Comparing your logs, the major difference is:
> > >
> > > [  168.754739] calling  i2c-8+ @ 3096
> > > [  168.758200] call i2c-8+ returned 0 after 0 usecs
> > > <<< 1st case stuck here
> > > [  168.762882] calling  card0-VGA-1+ @ 3096
> > > [  168.766867] call card0-VGA-1+ returned 0 after 0 usecs
> > > [  168.772085] calling  ttm+ @ 3096
> > > [  168.775360] call ttm+ returned 0 after 0 usecs
> > > [  168.779870] PM: resume of devices complete after 13117.603 msecs
> > > [  168.786006] PM: Finishing wakeup.
> > > <<<2nd case forward progress
> > >
> > > It looks like the VGA card has some problem on resume, which then
> >
> > In both cases - with the patch and without..
> 
> that's expected since device suspend is always invoked in the S3 path.
> 
> >
> > > makes dom0 stay in the idle loop and thus in the block hypercall; with
> > > no runnable vcpu, Xen then spends most of its time in idle_loop too. In an
> > > earlier log there are some stack traces in the i915 driver. Perhaps you can
> > > try a different machine
> >
> > Or remove the i915 just to eliminate that.
> 
> So any result there? :-)
> 
> > > or try native S3 on the same box to make sure it's not mixed up with native
> > > issues.
> > >
> > > >
> > > > (however, if I set cpuidle=0 cpufreq=none on the hypervisor line and
> > > > have the 9f301b0a0081676dfc71b7f0898295e6bcba391a patch it still
> > > > gets stuck).
> > > >
> > > > I figured that the primary reason the guest is allowed to
> > > > exit its SCHEDOP_block loop is b/c the pm_idle call is set to
> > > > acpi_processor_idle, which does "something" extra after the machine
> > > > comes out of an S3 suspend.
> > >
> > > If that's the case I think you should disable CONFIG_ACPI_PROCESSOR in
> > > dom0 before incorporating the Xen-specific version (the patch you tried).
> > > We don't want dom0 to play with Cx directly b/c it's the responsibility of
> > > Xen.
> >
> > Huh? You misunderstood me. The 'acpi_processor_idle' is the hypervisor's
> > idle loop. It can be running inside of that one, or the 'default_idle' loop.
> > Hence
> 
> Running inside which one? I'd think only default_idle invokes it, when the
> current cpu is actually idle.
> 
> > my question: why would that specific hypervisor idle loop make dom0 run
> > nicely while the default one would not.
> 
> Honestly, this is counterintuitive to me. I'd expect acpi_processor_idle
> to be more likely to cause issues than a pure "sti;hlt", because the ACPI
> version has more logic to handle. In the early days, when it was still in its
> stabilization phase, we did observe some cases of never exiting from a deep
> C-state, but this never happened with pure hlt.
> 
> IOW, I don't see this idle path as a necessary step for making S3 resume work;
> it only comes into play when the cpu has nothing to do...
> 
> >
> > In dom0, regardless of the patches, 'default_idle' is run, which makes the
> > xen_safe_halt paravirt call.
> 
> OK, that matches my expectation then.
> 
> >
> > >
> > > Of course we still need to figure out why the same issue occurs with cpuidle=0/
> > > cpufreq=none, which however can be revisited after the basic S3 works. :-)
> >
> > Right. The end result of those parameters is that the 'default_idle' in the
> > hypervisor is chosen instead of the 'acpi_processor_idle' one.
> > >
> > > >
> > > > Any ideas?
> > >
> > > No other ideas for now. From historical view Xen S3 was supported before
> >
> > Hmm, I am actually tempted to start commenting out code in
> > acpi_processor_idle and seeing what will cause it to have the same failure as
> > 'default_idle'.
> 
> You can also try "max_cstate=1" to see whether it makes any difference; it is
> expected to have a similar effect to safe_halt().
> 
> Thanks
> Kevin
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
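
The log comparison quoted above (finding the last device callback that
returned, and checking whether "PM: resume of devices complete" was ever
printed) could be scripted roughly as follows. This is only a sketch, not
something used in the thread: the function name resume_progress and the
returned dictionary keys are made up for illustration, and the regular
expressions assume the "calling X @ pid" / "call X returned N" line format
seen in the quoted excerpts.

```python
import re

# Matches lines such as "[  168.754739] calling  i2c-8+ @ 3096"
CALLING = re.compile(r"\]\s+calling\s+(\S+)\s+@\s+\d+")
# Matches lines such as "[  168.758200] call i2c-8+ returned 0 after 0 usecs"
RETURNED = re.compile(r"\]\s+call\s+(\S+)\s+returned\s+(-?\d+)")
# Marker printed once all device resume callbacks have finished
DONE = "PM: resume of devices complete"


def resume_progress(log_text):
    """Summarise how far device resume got in one serial log (hypothetical helper)."""
    last_returned = None   # name of the last callback that returned
    pending = []           # callbacks that were called but never returned
    completed = False      # True once the DONE marker has been seen
    for line in log_text.splitlines():
        m = CALLING.search(line)
        if m:
            pending.append(m.group(1))
            continue
        m = RETURNED.search(line)
        if m:
            last_returned = m.group(1)
            if m.group(1) in pending:
                pending.remove(m.group(1))
            continue
        if DONE in line:
            completed = True
    return {
        "last_returned": last_returned,
        "pending": pending,
        "completed": completed,
    }
```

Running it over both serial logs and comparing the two results would show
where the stuck case stops relative to the one that makes forward progress.
An empty "pending" list together with "completed" being False would suggest
the hang is between callbacks rather than inside one.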
