WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
xen-devel

[Xen-devel] RE: PV resume failed after self migration failed

To: <james.harper@xxxxxxxxxxxxxxxx>, xen devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] RE: PV resume failed after self migration failed
From: MaoXiaoyun <tinnycloud@xxxxxxxxxxx>
Date: Fri, 24 Jun 2011 18:32:50 +0800
Cc:
Delivery-date: Fri, 24 Jun 2011 03:33:50 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
Importance: Normal
In-reply-to: <BLU157-w151648AF69095AFC43138CDA500@xxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <BAY0-MC4-F15zXiPuZe00229bef@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>, <BLU157-w269412E815CBCDF33B2A83DA6B0@xxxxxxx>, <AEC6C66638C05B468B556EA548C1A77D01D57A5D@trantor>, <BLU157-w36414995E6873400744316DA6A0@xxxxxxx>, <BLU157-w514C39E37ADBE61990B1E8DA6A0@xxxxxxx>, <AEC6C66638C05B468B556EA548C1A77D01D57AF2@trantor>, <BLU157-w4794936CB633D64890AC8ADA6F0@xxxxxxx>, <AEC6C66638C05B468B556EA548C1A77D01D57B39@trantor>, <BLU157-w3AD2F750539A0F6490F7CDA510@xxxxxxx>, <AEC6C66638C05B468B556EA548C1A77D01D57C34@trantor>, <BLU157-w56E447A5A99A3C7AA5B572DA500@xxxxxxx>, <BLU157-w151648AF69095AFC43138CDA500@xxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Hi James,
 
       In addition, I think the if statement in XenVbd_HwScsiResetBus should check
suspend_resume_state_fdo, not suspend_resume_state_pdo.
       suspend_resume_state_pdo may already have changed to SR_STATE_SUSPENDING while
there are still I/O requests that have not finished. Checking the fdo state would let
those requests be completed when the reset happens.
 
       What do you think?
       Thanks.
 
 
static BOOLEAN
XenVbd_HwScsiResetBus(PVOID DeviceExtension, ULONG PathId)
{
  PXENVBD_DEVICE_DATA xvdd = DeviceExtension;
  srb_list_entry_t *srb_entry;
  PSCSI_REQUEST_BLOCK srb;
  int i;
  UNREFERENCED_PARAMETER(DeviceExtension);
  UNREFERENCED_PARAMETER(PathId);
  FUNCTION_ENTER();
  KdPrint((__DRIVER_NAME "     IRQL = %d\n", KeGetCurrentIrql()));
  if (xvdd->ring_detect_state == RING_DETECT_STATE_COMPLETE && xvdd->device_state->suspend_resume_state_pdo == SR_STATE_RUNNING) /* <-- this line */
  {
    while((srb_entry = (srb_list_entry_t *)RemoveHeadList(&xvdd->srb_list)) != (srb_list_entry_t *)&xvdd->srb_list)
    {
      srb = srb_entry->srb;
      srb->SrbStatus = SRB_STATUS_BUS_RESET;
      KdPrint((__DRIVER_NAME "     completing queued SRB %p with status SRB_STATUS_BUS_RESET\n", srb));
      ScsiPortNotification(RequestComplete, xvdd, srb);
    }
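
The suggestion above can be illustrated with a minimal, self-contained sketch. It is not the driver code: `mock_device`, `reset_bus`, `device_mid_suspend` and the two-value `sr_state` enum are hypothetical stand-ins, and the queue is reduced to a counter. It only shows the assumed situation where the pdo state is already SR_STATE_SUSPENDING, the fdo state is still SR_STATE_RUNNING, and requests are still queued.

```c
#include <stdbool.h>

/* Stand-in state values; the real driver defines its own. */
enum sr_state { SR_STATE_RUNNING, SR_STATE_SUSPENDING };

/* Simplified device model: a counter replaces the real srb_list. */
struct mock_device {
    enum sr_state state_pdo;   /* state requested by the pdo side */
    enum sr_state state_fdo;   /* state acknowledged by the fdo side */
    int queued;                /* requests still waiting in the queue */
    int completed_with_reset;  /* requests completed with a reset status */
};

/* Drain the queue only if the state selected by the caller is RUNNING. */
static bool reset_bus(struct mock_device *d, bool gate_on_fdo)
{
    enum sr_state gate = gate_on_fdo ? d->state_fdo : d->state_pdo;

    if (gate != SR_STATE_RUNNING)
        return false;          /* queued requests are left untouched */

    while (d->queued > 0) {
        d->queued--;
        d->completed_with_reset++;
    }
    return true;
}

/* Hypothetical scenario: suspend has started, two requests are queued. */
static struct mock_device device_mid_suspend(void)
{
    struct mock_device d = { SR_STATE_SUSPENDING, SR_STATE_RUNNING, 2, 0 };
    return d;
}
```

In this model, calling `reset_bus(&d, false)` on a device from `device_mid_suspend()` leaves both requests queued, while `reset_bus(&d, true)` completes them. Whether the real driver can reach this state is exactly the open question.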
 
 
>> Subject: RE: PV resume failed after self migration failed
>> Date: Wed, 22 Jun 2011 14:06:18 +1000
>> From: james.harper@xxxxxxxxxxxxxxxx
>> To: tinnycloud@xxxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx
>>
>> > >
>> > > The xenvbd driver doesn't do any timeout, windows does the timeout and
>> > > tells xenvbd to reset. I haven't tested the scenario you describe very
>> > > recently, and xenvbd is now two different drivers, one for scsiport (<=
>> > > 2003) and one for storport (>= Vista), so there could be bugs in either.
>> > >
>> >
>> > The bug can be reproduced on a 2003 32-bit system. We are using the scsi driver.
>> > I put some logging in XenVbd_HwScsiResetBus to see if there are any
>> > uncompleted srbs (like below),
>> > but I didn't see the log when XenVbd_HwScsiResetBus was called. So no IO is
>> > in the queue.
>>
>> Just to confirm, is this the issue that only happens when the migration
>> fails in xen and is cancelled?
>>
>>Exactly.
>>I've noticed some differences in the log.
>
>In a normal resume, the log shows the event ports being assigned like this:
>pdo_event_channel = 5 (Notifying event channel 5)
>suspend event channel = 6
>XEN_INIT_TYPE_EVENT_CHANNEL - event-channel = 7  (for VBD)
>XEN_INIT_TYPE_EVENT_CHANNEL - event-channel = 8  (VIF)
>
>>When the guest resumes locally from suspend (that is, the migration failed in Xen
>>after the guest had already suspended, so it needs to resume):
>
>>pdo_event_channel = 7 ( Notifying event channel 7)
>>suspend event channel = 8
>>XEN_INIT_TYPE_EVENT_CHANNEL - event-channel = 9 (vif)
>
>>The VBD port is not allocated, since the pdo is waiting for the fdo state change.
>
>>It looks like ports 5 and 6 are still occupied, or is pdo_event_channel bound twice?
>
>It works when I unbind pdo_event_channel & suspend_evtchn.
>===================================================================
>--- xenpci_fdo.c (revision 4304)
>+++ xenpci_fdo.c (working copy)
>@@ -656,6 +656,12 @@
>     }
>     WdfChildListEndIteration(child_list, &child_iterator);
>
>+    EvtChn_Unbind(xpdd, xpdd->pdo_event_channel);
>+    EvtChn_Close(xpdd, xpdd->pdo_event_channel);
>+
>+    EvtChn_Unbind(xpdd, xpdd->suspend_evtchn);
>+    EvtChn_Close(xpdd, xpdd->suspend_evtchn);
>+   
>     XenBus_Suspend(xpdd);
>     EvtChn_Suspend(xpdd);
>     XenPci_HighSync(XenPci_Suspend0, XenPci_SuspendN, xpdd);
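
The port numbers in the two logs can be reproduced with a toy model. This sketch assumes a lowest-free-port allocation policy and that ports 0-4 are already taken; `port_table`, `port_alloc`, `port_close` and `table_with_low_ports_taken` are hypothetical names, not Xen or driver APIs.

```c
#include <stdbool.h>

#define MAX_PORTS 16

/* Toy port table: in_use[p] is true while port p is bound. */
struct port_table {
    bool in_use[MAX_PORTS];
};

/* Allocate the lowest free port, or return -1 if none is free. */
static int port_alloc(struct port_table *t)
{
    for (int p = 0; p < MAX_PORTS; p++) {
        if (!t->in_use[p]) {
            t->in_use[p] = true;
            return p;
        }
    }
    return -1;
}

/* Release a port so that a later allocation can reuse it. */
static void port_close(struct port_table *t, int p)
{
    if (p >= 0 && p < MAX_PORTS)
        t->in_use[p] = false;
}

/* Assumed starting point: ports 0-4 are held by other users. */
static struct port_table table_with_low_ports_taken(void)
{
    struct port_table t = { { false } };

    for (int p = 0; p < 5; p++)
        t.in_use[p] = true;
    return t;
}
```

Under these assumptions, two allocations return 5 and 6. If they are never closed, the next two return 7 and 8, which matches the numbers in the failed-resume log. If they are closed first, the next two return 5 and 6 again, as in the normal resume.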
>
>
>BTW, is there a "break" missing in XenVbd_HwScsiInterrupt (xenvbd_scsiport.c, after
>line 928, before default)? It is harmless, though.
>
>924 case SR_STATE_RUNNING:
>925 KdPrint((__DRIVER_NAME " New pdo state %d\n", suspend_resume_state_pdo));
>926 xvdd->device_state->suspend_resume_state_fdo = suspend_resume_state_pdo;
>927 xvdd->vectors.EvtChn_Notify(xvdd->vectors.context, xvdd->device_state->pdo_event_channel);
>928 ScsiPortNotification(NextRequest, DeviceExtension);
>929 default:
>930 KdPrint((__DRIVER_NAME " New pdo state %d\n", suspend_resume_state_pdo));
>931 xvdd->device_state->suspend_resume_state_fdo = suspend_resume_state_pdo;
>932 xvdd->vectors.EvtChn_Notify(xvdd->vectors.context, xvdd->device_state->pdo_event_channel);
>933 break;
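
The effect of the missing "break" can be shown in isolation. This is a sketch with stand-in names (`notify`, `handle_without_break`, `handle_with_break`, and a two-value `sr_state` enum), not the driver code; it only counts how often the notification step would run.

```c
/* Stand-in state values; the real driver defines its own. */
enum sr_state { SR_STATE_RUNNING, SR_STATE_SUSPENDING };

/* Counts how often the event channel would be notified. */
static int notify_count;

static void notify(void)
{
    notify_count++;
}

/* Mirrors the quoted switch: the RUNNING case has no break. */
static int handle_without_break(enum sr_state s)
{
    notify_count = 0;
    switch (s) {
    case SR_STATE_RUNNING:
        notify();
        /* fall through */
    default:
        notify();
        break;
    }
    return notify_count;
}

/* Same switch with the break added after the RUNNING case. */
static int handle_with_break(enum sr_state s)
{
    notify_count = 0;
    switch (s) {
    case SR_STATE_RUNNING:
        notify();
        break;
    default:
        notify();
        break;
    }
    return notify_count;
}
```

For SR_STATE_RUNNING the first variant notifies twice and the second once; for any other state both notify once. In the quoted code the duplicated work is a repeated log line, state copy and notification.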
>
>Thanks.
>>> James
>>>
>