This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


[Xen-devel] [PATCH] Use timeout on xenstore read_reply to avoid task hun

To: xen-devel@xxxxxxxxxxxxxxxxxxx, Chris Wright <chrisw@xxxxxxxxxxxx>, Jeremy Fitzhardinge <jeremy@xxxxxxxxxxxxx>
Subject: [Xen-devel] [PATCH] Use timeout on xenstore read_reply to avoid task hunging
From: Frank Pan <frankpzh@xxxxxxxxx>
Date: Wed, 2 Mar 2011 18:20:33 +0800
Delivery-date: Wed, 02 Mar 2011 02:24:15 -0800
Recent pvops kernels use wait_event to wait for a reply from xenstored.
This can leave xenstore clients hung inside the kernel when xenstored does
not respond correctly. The hang even occurs when xenstored is not
running at all, which may confuse developers.

The attached patch uses wait_event_timeout instead, and returns -EIO to
userspace if xenstored does not respond within 5 seconds.

Simply changing wait_event to wait_event_timeout is not correct. Once
xenstored starts working again, it will process the requests that were
abandoned earlier, and the responses it places on the reply_list queue
would be mistaken for replies to later requests. The patch therefore
also uses the req_id field in struct xsd_sockmsg as a sequence ID, so
that responses to abandoned requests are not confused with current ones.
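
The sequence-ID idea can be sketched in plain C as follows. Again this is
an illustration under assumptions, not the patch: struct reply keeps only
a req_id and a body (the real struct xsd_sockmsg also carries type, tx_id
and len), and struct reply_queue, queue_push and take_reply are
hypothetical stand-ins for the kernel's reply_list handling.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified reply; only req_id matters for this illustration. */
struct reply {
    uint32_t    req_id;
    const char *body;
};

/* Hypothetical fixed-size FIFO standing in for reply_list. */
#define QUEUE_CAP 8
struct reply_queue {
    struct reply items[QUEUE_CAP];
    size_t       head;
    size_t       count;
};

/* Append a reply; returns 0 on success, -1 if the queue is full. */
static int queue_push(struct reply_queue *q, struct reply r)
{
    if (q->count == QUEUE_CAP)
        return -1;
    q->items[(q->head + q->count) % QUEUE_CAP] = r;
    q->count++;
    return 0;
}

/*
 * Pop replies until one carries the expected req_id. A reply with any
 * other id is assumed to answer a request that was abandoned after a
 * timeout, so it is dropped. Returns NULL if the queue drains without
 * a match; a real caller would then go back to waiting.
 */
static const char *take_reply(struct reply_queue *q, uint32_t expected_id)
{
    while (q->count > 0) {
        struct reply r = q->items[q->head];

        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        if (r.req_id == expected_id)
            return r.body;
        /* Stale reply: discard it and keep looking. */
    }
    return NULL;
}
```

If request 1 timed out and request 2 is now outstanding, a late reply
tagged 1 is discarded and only the reply tagged 2 reaches the caller.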

Any suggestions? Thanks.

Frank Pan

Computer Science and Technology
Tsinghua University
