This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


[Xen-devel] dom0 serial input overruns

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-devel] dom0 serial input overruns
From: Ferenc Wagner <wferi@xxxxxxx>
Date: Sun, 20 Mar 2011 12:02:08 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (gnu/linux)
(no dice on xen-users, let's try xen-devel...)


I'm running an HA Xen cluster in which the dom0s are cross-linked via a
null modem serial cable for heartbeat redundancy.  This works most of
the time, but the serial connection is very unreliable: it drops
characters all the time, with a lot of messages like "ttyS0: 2 input
overrun(s)" in dmesg.  There is no such problem when running the same
kernel on bare metal.  The link runs at 9600 baud, so the system should
easily cope, but it looks like the serial interrupt isn't serviced
quickly enough under Xen.  I'm running Xen 4.0.1 now with kernel 2.6.32
(stock Debian squeeze), but the problem isn't specific to this setup;
Xen 3.2 with kernel 2.6.18 had much the same issue.  Raising dom0's
scheduling weight didn't help much (or at all), and pinning all domUs to
CPU1-3 and vcpu0 of dom0 to CPU0 actually made the problem worse.
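
To put a rough number on "should easily cope", the sketch below estimates
how long the receive interrupt can go unserviced before characters are
lost.  The framing (8N1, i.e. 10 bits per character), the 16-byte FIFO and
the 8-byte receive trigger level are assumptions typical of a 16550A-style
UART, not values taken from this system, and the function names are made
up for illustration.

```python
# Rough latency budget for a UART receive FIFO.
# All hardware parameters here are illustrative assumptions.

def char_time_s(baud, bits_per_char=10):
    """Time on the wire for one character; 10 bits assumes 8N1 framing."""
    return bits_per_char / baud


def overrun_budget_s(baud, fifo_depth=16, rx_trigger=8, bits_per_char=10):
    """Approximate time between the RX interrupt firing and the FIFO filling up.

    The interrupt fires once rx_trigger characters are queued; the FIFO can
    still absorb (fifo_depth - rx_trigger) more before further input is lost.
    """
    free_slots = fifo_depth - rx_trigger
    return free_slots * char_time_s(baud, bits_per_char)


if __name__ == "__main__":
    baud = 9600
    print(f"character time: {char_time_s(baud) * 1e3:.2f} ms")
    print(f"budget with FIFO: {overrun_budget_s(baud) * 1e3:.2f} ms")
    # Without a FIFO (8250/16450 style) the budget is about one character time.
    print(f"budget without FIFO: {overrun_budget_s(baud, 1, 0) * 1e3:.2f} ms")
```

Under these assumptions a character takes about 1 ms on the wire and the
budget is on the order of 8 ms, so repeated overruns would suggest that
interrupt servicing is sometimes delayed by several milliseconds or more.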

$ cat /proc/interrupts 
            CPU0       CPU1       CPU2       CPU3       
   1:          9          0          0          0  xen-pirq-ioapic-edge  i8042
   4:   31712520          0          0          0  xen-pirq-ioapic-edge  serial
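
To quantify how often characters are lost, the overrun messages quoted
above can be tallied per port.  This is a minimal sketch that assumes the
kernel log lines contain text of the form "ttyS0: 2 input overrun(s)" as
quoted from dmesg; the function name tally_overruns is hypothetical.

```python
import re
from collections import Counter

# Matches the message format quoted in this mail, e.g. "ttyS0: 2 input overrun(s)".
OVERRUN_RE = re.compile(r"(ttyS\d+): (\d+) input overrun\(s\)")


def tally_overruns(log_text):
    """Sum the reported overrun counts per serial port in a chunk of kernel log."""
    totals = Counter()
    for port, count in OVERRUN_RE.findall(log_text):
        totals[port] += int(count)
    return totals
```

Feeding it the saved output of dmesg at intervals, and comparing the totals
with the growth of the IRQ 4 counter in /proc/interrupts, would give a rough
loss rate per serviced interrupt.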

Is there a known solution to this problem?  It feels like excessive
dom0 interrupt latency, maybe caused by the single-threaded hypervisor?
Comments more than welcome!
