WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

Re: [Xen-devel] [PATCH] Use timeout on xenstore read_reply to avoid task

On Wed, 2011-03-02 at 11:35 +0000, Frank Pan wrote:
> Oh sorry.
> Patch attached.
> 
> On Wed, Mar 2, 2011 at 6:34 PM, Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote:
> > It all sounds very plausible to me but you've forgotten the patch ;-)
> >
> > Why wait_event_timeout and not wait_event_interruptible to allow users
> > to interrupt? In particular I'm concerned about the arbitrarily chosen
> > 5s timeout not being sufficient on a busy system.
> 
> FP: I wait_event_interruptible is a choice. But it needs user
> operation such as kill command. User-level tool(xenstore-ls for
> example) can also set SIGALRM or something else, but it sounds not so
> good.

I think this is preferable to requiring all clients be reworked to
handle a timeout event which they aren't currently expecting.

Rather than have all xenbus clients expect and handle an arbitrary
timeout we should provide a new interface to in-kernel users of xenbus
which includes a timeout parameter, e.g. foo_timeout. (assuming there
are in kernel users who could do anything sane with a timeout, if not
then we shouldn't bother with the interface).

For userspace clients I think we would probably be better off making
sure that poll/select work properly than trying to find a way to expose
timeouts of this sort, that combined with making the sleeps
interruptible, would cover the problems we care about, I think.

> The timeout parameter is something discussible. 5s may not be a good
> one, but I believe xenstored on a healthy system should be response in
> 5s. What do you think?

I've definitely seen cases in heavily loaded domain 0 where it has taken
longer than 5s (think of a system which is starting dozens of new
domains concurrently for example)

> > Once specific pitfall which I remember was that userspace clients are at
> > liberty to make use of the req_id themselves (and some do). Fixing this
> > might involve shadowing the user provided req_id with a kernel generated
> > ID on the ring and unshadowing in the responses...
> 
> FP: Yes, that's what I supposed to do. But I cannot find any
> dereference on the req_id section of the reply msg. If it exist
> somewhere, shadowing is surely needed.

Userspace clients communicating via the drivers/xen/xenfs/xenbus.c
driver are at liberty to use the req_id field for their own purposes.

For example, the ocaml xenbus bindings in
xen-unstable.hg:tools/ocaml/libs/xb expose this field and I think
(although I'm not 100% sure) that the oxenstored (tools/ocaml/xenstored)
uses it.

Ian.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel