   
 
 

xen-devel

Re: [Xen-devel] libxl: error handling before xenstored runs

On Thu, 2011-02-10 at 08:55 +0000, Ian Campbell wrote:
> On Wed, 2011-02-09 at 17:39 +0000, Gianni Tedesco wrote: 
> > On Wed, 2011-02-09 at 15:54 +0000, Christoph Egger wrote:
> > > On Wednesday 09 February 2011 16:52:08 Christoph Egger wrote:
> > > > On Wednesday 09 February 2011 16:42:21 Kamala Narasimhan wrote:
> > > > > > >>> I'm currently on c/s 22834. Which c/s added the check you are
> > > > > > >>> talking about?
> > > > > >>
> > > > > >> http://xenbits.xen.org/staging/xen-unstable.hg?rev/eefb8e971be5
> > > > > >
> > > > > > This is c/s 22806. So my tree is new enough.
> > > > >
> > > > > Right, but did you happen to check how you got past the check done by
> > > > > that patch for the case in question?
> > > >
> > > > The pid file simply doesn't exist.
> > > 
> > > Oh wait. Hit the 'send' button too fast.
> > > 
> > > The pid file does exist from previous boot.
> > 
> > Bleh, precisely my problem with these heuristic checks. It's worse on my
> > box because if this happens I end up with unkillable xl processes due to
> > libxenstore wanting to open /dev/xen/xenbus or whatever it is.
> 
> That's the underlying bug which the heuristic is trying to avoid...
> 
> Fundamentally the xs ring protocol is missing any way to tell if someone
> is listening on the other end so you have no choice but to try
> communicating and see if anyone responds.
> 
> It's a pretty straightforward bug that the kernel does the waiting to
> see if anyone responds bit with an uninterruptible sleep. I took a quick
> look a little while ago but unfortunately it didn't look straightforward
> to fix on the kernel side :-( I can't remember why though.

I suppose it's because we don't want to be killable after sending the
message but before receiving the reply, since the ring is going to get
jammed up due to nobody consuming the reply. The reply that in this case
never comes, but the kernel can't know that it won't eventually come,
right?

Gianni
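
For readers following the pid file part of the thread: the sketch below shows one common way to tell a leftover pid file from a live daemon, by probing the recorded pid with signal 0. It is only an illustration, not the check that libxl performs; the function name `pid_file_is_stale` and the idea of passing the path as an argument are assumptions made for the example. It remains a heuristic, because after a reboot the old pid may have been reused by an unrelated process.

```python
import os


def pid_file_is_stale(path):
    """Return True if the pid file is missing, unparsable, or names a dead process.

    Hypothetical helper for illustration; not part of libxl or xenstored.
    """
    try:
        with open(path) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return True  # no file, unreadable file, or no integer in it
    if pid <= 0:
        return True  # never signal process groups or "all processes"
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing; it only checks the target
    except ProcessLookupError:
        return True  # no process has this pid: the file is left over
    except PermissionError:
        return False  # a process exists but belongs to another user
    # A process with this pid exists. It may still be an unrelated process
    # that reused the pid after a reboot, so this is only a heuristic.
    return False
```

A caller would treat a True result as "the daemon is not running" and avoid opening the xenstore device at all. A False result cannot prove that the process holding the pid is actually xenstored.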

