xen-devel

[Xen-devel] xen-4.1: PV domain hanging at startup, jiffies stopped

To: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] xen-4.1: PV domain hanging at startup, jiffies stopped
From: Marek Marczykowski <marmarek@xxxxxxxxxxxx>
Date: Sun, 28 Aug 2011 15:13:46 +0200
Cc: Joanna Rutkowska <joanna@xxxxxxxxxxxxxxxxxxxxxx>
Delivery-date: Mon, 29 Aug 2011 08:51:47 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.18) Gecko/20110621 Fedora/3.1.11-1.fc14 Lightning/1.0b3pre Thunderbird/3.1.11
Hey,

I'm experiencing a strange problem: a non-deterministic PV domain hang,
only on some machines (with a fast SSD drive). I've tried xen-4.1.0 and
xen-4.1.1 with many different kernels:
VM:
 - 2.6.38.3 xenlinux based on SUSE package
 - vanilla 3.0.3
 - vanilla 3.1 rc2
dom0:
 - 2.6.38.3 xenlinux based on SUSE package
 - vanilla 3.1 rc2

The result is always the same: sometimes the VM hangs at startup. SysRq-T
shows modprobe waiting in "wait_for_devices" (specifically in
schedule_timeout), and the jiffies counter does not increase between
task-state dumps.
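
The stalled counter can be checked mechanically by comparing the jiffies
value in two SysRq-T dumps. This is only a minimal sketch: it assumes the
dumps contain the scheduler debug section (CONFIG_SCHED_DEBUG) with a line
of the form ".jiffies : <number>", and the function names are illustrative,
not part of any Xen or kernel tool.

```python
import re

# Assumed format of the scheduler debug line in a SysRq-T dump, e.g.:
#   "  .jiffies                                 : 4294937800"
JIFFIES_RE = re.compile(r"\.jiffies\s*:\s*(-?\d+)")


def first_jiffies(dump_text):
    """Return the first jiffies value found in a dump, or None."""
    match = JIFFIES_RE.search(dump_text)
    return int(match.group(1)) if match else None


def jiffies_stalled(first_dump, second_dump):
    """Return True if both dumps report the same jiffies value.

    Returns None when either dump has no recognisable jiffies line,
    so a missing line is not mistaken for a stalled counter.
    """
    a = first_jiffies(first_dump)
    b = first_jiffies(second_dump)
    if a is None or b is None:
        return None
    return a == b
```

To use it, read the two saved console dumps into strings and pass them to
jiffies_stalled(); a True result matches the symptom described above.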

The only thing I've found that is (probably) connected with this problem
is these domU kernel messages:
CE: xen increased min_delta_ns to 150000 nsec
(...)
CE: xen increased min_delta_ns to 4000000 nsec
CE: Reprogramming failure. Giving up

These messages do not appear in a successful boot.
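
A saved domU console log can be scanned for these messages to tell a
failed boot from a successful one. This is a rough sketch, assuming the
message text is exactly as quoted above; scan_console is an illustrative
helper name, not an existing tool.

```python
import re

# Message formats as quoted in the domU kernel output above.
MIN_DELTA_RE = re.compile(r"CE: xen increased min_delta_ns to (\d+) nsec")
GIVE_UP_MSG = "CE: Reprogramming failure. Giving up"


def scan_console(lines):
    """Collect min_delta_ns increases and note whether the kernel gave up.

    Returns (deltas, gave_up): deltas is the list of min_delta_ns values
    in the order they were logged, gave_up is True if the
    "Reprogramming failure" message was seen.
    """
    deltas = []
    gave_up = False
    for line in lines:
        match = MIN_DELTA_RE.search(line)
        if match:
            deltas.append(int(match.group(1)))
        elif GIVE_UP_MSG in line:
            gave_up = True
    return deltas, gave_up
```

Passing an open log file (for example the netvm-console-begin attachment)
to scan_console() works, since a file object iterates over its lines.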

I've also tried some options for Xen and the domU kernel (in all
combinations), but without success:
xen: tsc=unstable, cpufreq=none
domU: nohz=off, clocksource=tsc

Some combinations of the above options lowered the frequency of the
problem (e.g. tsc=unstable + nohz=off), but it still happens quite often:
about 1 in 15 boots fails.

Do you have any idea what the cause is and what could help?

Attached are all relevant logs and configs:
xl-dmesg: xl dmesg after failed domU start
netvm-console-begin: kernel messages from failed domU
netvm-console-sysrq-t-1: first domU SysRq-T
netvm-console-sysrq-t-2: second domU SysRq-T
netvm.conf: domU config
xenstore-ls: result of xenstore-ls -fp
dom0-dmesg: dom0 kernel messages
config-xenlinux: 2.6.38.3 kernel config (same for dom0 and domU)
config-pvops: 3.1rc2 kernel config (same for dom0 and domU)

PS: The "script" prefix in the domU vbd config comes from a custom patch
to libxl which implements the xend behaviour of using a hotplug script
for VBD setup.

-- 
Pozdrawiam / Best Regards,
Marek Marczykowski         | RLU #390519
marmarek at mimuw edu pl   | xmpp:marmarek at staszic waw pl

Attachment: config-pvops
Description: Text document

Attachment: config-xenlinux
Description: Text document

Attachment: dom0-dmesg
Description: Text document

Attachment: netvm.conf
Description: Text document

Attachment: netvm-console-begin
Description: Text document

Attachment: netvm-console-sysrq-t-1
Description: Text document

Attachment: netvm-console-sysrq-t-2
Description: Text document

Attachment: xenstore-ls
Description: Text document

Attachment: xl-dmesg
Description: Text document

Attachment: smime.p7s
Description: S/MIME Cryptographic Signature

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel