xen-devel

Re: [Xen-devel] libxl: error handling before xenstored runs

On 10/02/11 08:55, Ian Campbell wrote:
That's the underlying bug which the heuristic is trying to avoid...

Fundamentally the xs ring protocol is missing any way to tell if someone
is listening on the other end so you have no choice but to try
communicating and see if anyone responds.

It's a pretty straightforward bug that the kernel does the waiting to
see if anyone responds bit with an uninterruptible sleep. I took a quick
look a little while ago but unfortunately it didn't look straightforward
to fix on the kernel side :-( I can't remember why though.

For starters, the protocol requires messages to sit on the ring for an undetermined amount of time (boot watches).

It might be simpler to support allowing the userspace client to
explicitly specify a timeout. I'm not sure what the impact on the ring
is of leaving unconsumed requests on the ring when the other end does
show up. Presumably the kernel driver just needs to be prepared to
swallow responses whose target has given up and gone home.

No, the simplest thing to do is to use the socket connection exclusively. That is how we're doing it in XCP and XCI.
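
As a rough illustration of why the socket is easier to handle: connecting to a Unix socket fails immediately when nothing is listening, whereas a request placed on the ring simply waits. The sketch below is Python rather than libxl code; the helper name `xenstored_reachable` is hypothetical, and the default path `/var/run/xenstored/socket` is an assumption that may differ between installations.

```python
import socket

# Assumed default location of the xenstored socket; it may differ per install.
XENSTORED_SOCKET = "/var/run/xenstored/socket"

def xenstored_reachable(path=XENSTORED_SOCKET, timeout=1.0):
    """Return True if something accepts connections on the Unix socket at path."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect(path)
        return True
    except OSError:
        # No socket file, no listener, or a timeout: report failure
        # instead of blocking forever.
        return False
    finally:
        s.close()
```

A caller could use such a check to report "xenstored is not running" up front instead of issuing a request that nobody will answer.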

The protocol is not designed to do async either, so leaving unconsumed requests on the ring could be pretty disastrous if the other end shows up. Provided the kernel doesn't detect it (I don't think it does [1]), it would imply spurious replies: for example, the reply to the previous waiting read on "/abc/def" could be handed back as the answer to the next read on "/xyz/123".

Maybe we should add an explicit ping/pong ring message to the xs ring
protocol?

And who's going to reply to this if xenstored is missing? You would require the kernel to introspect the messages and reply by itself.

[1] The kernel would happily read the previous reply on the ring after xenstored has put the actual reply after it and triggered the event channel. (The kernel could actually check the request ID and see whether they match, but it doesn't.)
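
To make the mismatch concrete, here is a small Python simulation, not kernel code. It assumes the wire header is the four 32-bit fields of `struct xsd_sockmsg` (type, req_id, tx_id, len) and that the read message type is 2, as in xs_wire.h; little-endian packing is an assumption for the example. The ring is modelled as a plain queue, watch events are ignored, and `naive_receive`, `checked_receive` and `replies_after_late_daemon` are hypothetical names.

```python
import struct
from collections import deque

# Assumed header layout of struct xsd_sockmsg: type, req_id, tx_id, len.
HEADER = struct.Struct("<IIII")
XS_READ = 2  # assumed message type value for a read request/reply

def pack_msg(msg_type, req_id, payload, tx_id=0):
    """Build one message: header followed by the payload bytes."""
    return HEADER.pack(msg_type, req_id, tx_id, len(payload)) + payload

def unpack_msg(raw):
    """Split one message into (type, req_id, payload)."""
    msg_type, req_id, _tx_id, length = HEADER.unpack_from(raw)
    return msg_type, req_id, raw[HEADER.size:HEADER.size + length]

def naive_receive(ring):
    """Take whatever reply is next on the ring, without looking at req_id."""
    return unpack_msg(ring.popleft())[2]

def checked_receive(ring, expected_req_id):
    """Hypothetical fix: swallow replies that belong to abandoned requests."""
    while ring:
        _, req_id, payload = unpack_msg(ring.popleft())
        if req_id == expected_req_id:
            return payload
        # Stale reply for a caller that gave up: drop it and keep looking.
    return None

def replies_after_late_daemon():
    """Replies written once xenstored finally starts.

    Request 1 (read "/abc/def") was abandoned by its caller;
    request 2 (read "/xyz/123") is the one still being waited on.
    """
    return deque([
        pack_msg(XS_READ, 1, b"value-of-abc-def"),
        pack_msg(XS_READ, 2, b"value-of-xyz-123"),
    ])
```

With these definitions, `naive_receive(replies_after_late_daemon())` hands the caller waiting on "/xyz/123" the value of "/abc/def", while `checked_receive(replies_after_late_daemon(), 2)` skips the stale reply and returns the matching one.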

--
Vincent

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
