|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-users] Watchdog and live migration
>>> On 16.03.12 at 15:58, Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote:
> On Fri, 2012-03-16 at 14:34 +0000, Jan Beulich wrote:
>> >>> On 16.03.12 at 13:32, Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote:
>> > I think this is a bug, moving to xen-devel@ and CCing the drivers author
>> > (Hi Jan).
>> >
>> > On Fri, 2012-03-16 at 12:29 +0000, Wouter de Geus wrote:
>> >> * Ian Campbell <Ian.Campbell@xxxxxxxxxx> [2012-03-16 10:55:34 +0000]:
>> >>
>> >> > > In the domU a /dev/watchdog exists, but I never touch it myself.
>> >> >
>> >> > Does your distro start a watchdogd for you?
>> >> >
>> >> > Even if it did then it should obviously continue to poke the watchdog
>> >> > even after a migration, but it might provide some hints.
>> >>
>> >> Nope, it doesn't.
>> >> As an experiemtn I tried sending a 'V' to /dev/watchdog just after
>> >> migration, which seems to be the magic byte to stop the watchdog
>> >> according to the linux kernel docs.
>> >>
>> >> (according to Documentation/watchdog/watchdog-api.txt) -
>> >> "If a driver supports "Magic Close", the driver will not disable the
>> >> watchdog unless a specific magic character 'V' has been sent to
>> >> /dev/watchdog just before closing the file."
>> >>
>> >> Surprisingly, the domU seemed to stay alive after doing this.
>> >> So it would appear that somehow the watchdog is triggered after migration.
>> >> No idea why though.
>> >
>> > drivers/watchdog/xen_wdt.c:xen_wdt_resume() unconditionally calls
>> > xen_wdt_start(). Shouldn't it only do this if the wdt is active?
>>
>> Could you give the patch below a try?
>>
>> Jan
>>
>> --- a/drivers/watchdog/xen_wdt.c
>> +++ b/drivers/watchdog/xen_wdt.c
>> @@ -297,11 +297,19 @@ static void xen_wdt_shutdown(struct plat
>>
>> static int xen_wdt_suspend(struct platform_device *dev, pm_message_t state)
>> {
>> - return xen_wdt_stop();
>> + typeof(wdt.id) id = wdt.id;
>
> typeof here is a bit odd.
But I want to match the original field's type.
>> + int rc = xen_wdt_stop();
>> +
>> + wdt.id = id;
>> +
>> + return rc;
>> }
>>
>> static int xen_wdt_resume(struct platform_device *dev)
>> {
>> + if (!wdt.id)
>> + return 0;
>
> Can't you check is_active instead and avoid having to play tricks in
> xen_wdt_suspend to preserve a non-0 wdt.id when the watchdog is active?
I first thought of this too, but is_active doesn't represent whether
a watchdog is actually engaged - it merely says whether the
watchdog device is currently open (but watchdog setup itself
may have failed).
Looking at the code for this again, I see another problem though:
is_active gets cleared in xen_wdt_release() even when a release
was not expected (similar to how iTCO_wdt, pcwd_pci, or pcwd_usb
behave, but imo buggy nevertheless), or when xen_wdt_stop()
failed.
Jan
>> + wdt.id = 0;
>> return xen_wdt_start();
>> }
>>
>>
>>
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
http://lists.xen.org/xen-users
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |