WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

Re: [Xen-devel] [PATCH] Fix device removal on net and block frontend dri

To: Stefan Berger <stefanb@xxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH] Fix device removal on net and block frontend drivers
From: Ewan Mellor <ewan@xxxxxxxxxxxxx>
Date: Mon, 21 Nov 2005 18:31:48 +0000
Cc: Murillo Fernandes Bernardes <mfb@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Mon, 21 Nov 2005 18:31:51 +0000
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <OF0DF85722.4C1769E2-ON852570C0.005B9C2A-852570C0.005C67C9@xxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <200511211433.31356.mfb@xxxxxxxxxx> <OF0DF85722.4C1769E2-ON852570C0.005B9C2A-852570C0.005C67C9@xxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.9i
On Mon, Nov 21, 2005 at 11:49:19AM -0500, Stefan Berger wrote:

> Murillo Fernandes Bernardes <mfb@xxxxxxxxxx> wrote on 11/21/2005 11:33:31
> AM:
> 
> > On Monday 21 November 2005 14:07, Stefan Berger wrote:
> > > xen-devel-bounces@xxxxxxxxxxxxxxxxxxx wrote on 11/21/2005 10:43:01 AM:
> > > > Frontend devices are not being unregistered when in closed state.
> The
> > > > following patch fix that.
> > > >
> > > > Fix bug #420.
> > > >
> > > > Makes "05_attach_and_dettach_device_repeatedly_pos" and
> > > > "09_attach_and_dettach_device_check_data_pos" tests pass.
> > >
> > > Did you test this with suspending / resuming a dom U? The reason I am
> > > asking is that when suspending the driver immediately gets into state
> > > 'Closed' and when resuming into state 'Connected',  but now your
> device is
> > > unregistered.
> >
> > No, I did not test suspend/resume.
> >
> > I really don't see why it should get into Closed on suspend, but anyway,
> is
> > this really hapenning? I could not find any switch to Closed into
> suspend's
> > code, neither on resume.

xenbus_read_driver_state returns Closed if the backend path is no longer
present.  Maybe this is where the Closed has come from.  However,
xenbus_probe.c:otherend_changed is supposed to be protecting us from watches
that have fired immediately after a resume.

Could you please enable the DPRINTK in xenbus_probe and see whether the Closed
is coming through the test in otherend_changed?  This would help diagnose the
problem.

The intention with xenbus_read_driver_state returning Closed was that this was
the correct way of forcing the driver to close down if the path goes away, as
in normal use the backend path should not just disappear, and for resumption
we have a way to detect that.  Perhaps one or other of these things should
change, but it's not clear to me which one it is, or if indeed this is the
problem at all.

> What I am seeing is that after a suspend / resume the interface 'eth0' is
> completely gone. 'ifconfig -a' shows everything, but no eth0.
> 
> You might only want to unregister if the domain was not suspended. So you
> probably need to implement the .suspend function in the frontend and set a
> state variable to know whether the domain is being hibernated, and you
> clear that variable in the .resume. You check that variable when the
> driver is going into the 'Closed' state and only unregister if not in
> 'suspend' mode.

If this is necessary, and it's not clear to me that it is, then this is a
facility that Xenbus should provide in general, rather than each driver having
to hack around the problem itself.

Returning to Murillo's patch, I assumed that the unregister_netdev in
close_netdev would implicitly call device_unregister, and that this was the
correct way to close down the device.  Is this not the case?

My intention for closedown of the device was that the backend would move to
state Closing, triggering a graceful shutdown of the frontend (in this case
through netfront_closing, close_netdev, etc.).  AFAIK, Xend is correctly
setting the backend to state Closing, so I expect unregister_netdev to be
being called.

There is the different issue that Xend does not check for the existence or
state of a device before hotplugging a new one.  This means that the frontend
might not have time to see the Closing before having a chance to close down,
for example.  This is a problem with Xend that needs to be fixed there.  Xend
should refuse to hotplug a device if the frontend for the old one has not yet
closed down.  This is not to say that Murillo's patch is wrong, but simply to
say that I expect wider issues than can be fixed by this patch alone.

Ewan.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel