This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-users] Re: Xen, LVM, DRBD, Linux-HA

Ross S. W. Walker wrote:
Please do not hijack threads, please put all your drbd questions
to the drbd-user list.

This is a xen issue.

I'm pretty sure it's not a problem with drbd itself but with drbd under Xen, probably a clash between Xen networking and disk I/O.

For people who are trying to use drbd under xen this is an important issue.

If properly set up and maintained, drbd should work correctly.

I've been checking and many of our xen servers are showing similar packet loss under disk IO load. However, so far, only drbd shows such catastrophic problems.

Please advise how to configure things so that disk I/O does not kill the network, as that *seems* to be what is happening here.

-----Original Message-----
From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Steve Wray
Sent: Tuesday, April 22, 2008 4:19 PM
To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-users] Re: Xen, LVM, DRBD, Linux-HA

I am going to reply to this thread, but starting with something new, as it doesn't seem to have been covered so far.

I have been testing drbd under Xen and found some very disturbing things.

I'd like to implement this in a production system but this scares the hell out of me...

I have two Dom0 servers connected with a crossover cable between two gigabit e1000 NICs. No switch involved.

One DomU on each server with a 20G drbd device shared between them.

The drbd config contains:

   syncer {
     rate 10M;
     group 1;
     al-extents 257;
   }

   net {
     on-disconnect reconnect;
   }

so the net section is working at defaults. At first I had thought that the problems I was seeing were due to timeout values etc. and tried various parameters in the net section, but nothing made any difference.
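For reference, this is a sketch of the sort of net-section timeout knobs I experimented with; the values shown are the drbd defaults, not a recommendation:

```
   net {
     timeout        60;   # tenths of a second (6 s) before the peer is declared dead
     connect-int    10;   # seconds between connection attempts
     ping-int       10;   # seconds between keep-alive pings
     on-disconnect reconnect;
   }
```

None of these changed the behaviour described below.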

When, on the current secondary node, I execute

drbdadm invalidate all

I get frequent errors such as:

drbd0: PingAck did not arrive in time.
drbd0: drbd0_asender [1572]: cstate SyncSource --> NetworkFailure
drbd0: asender terminated
drbd0: drbd_send_block() failed
drbd0: drbd0_receiver [1562]: cstate NetworkFailure --> BrokenPipe
drbd0: short read expecting header on sock: r=-512
drbd0: worker terminated
drbd0: ASSERT( mdev->ee_in_use == 0 ) in /usr/src/modules/drbd/drbd/drbd_receiver.c:1880
drbd0: drbd0_receiver [1562]: cstate BrokenPipe --> Unconnected
drbd0: Connection lost.

Observing xm top in both Dom0s, I note a HUGE number of dropped RX packets being reported on both DomUs' vif interfaces. The RX packet dropping continues throughout the drbd resync and the count grows extremely large.
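For what it's worth, the same counters xm top reports can be read straight from /proc/net/dev in the Dom0. A minimal sketch (the interface name vif1.0 and the numbers in the demo sample are made up; point it at the live file on a real system):

```shell
# Print the RX "dropped" counter for each Xen vif interface in a
# /proc/net/dev style file. With FS='[: ]+' the fields after the leading
# whitespace are: $2=iface, $3=rx_bytes, $4=rx_packets, $5=rx_errs, $6=rx_drop.
show_vif_drops() {
    awk -F'[: ]+' '/vif/ { printf "%s RX dropped: %s\n", $2, $6 }' "$1"
}

# Demo on a captured sample; in real use: show_vif_drops /proc/net/dev
cat > /tmp/net_dev_sample <<'EOF'
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
  vif1.0:  1500000   12000    0 4821    0     0          0         0   900000    8000    0    0    0     0       0          0
EOF
show_vif_drops /tmp/net_dev_sample   # -> vif1.0 RX dropped: 4821
```

Running this in a loop during a resync makes it easy to see whether the drop counter is climbing.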

The ifconfig output within the DomUs does not show any dropped packets.

I have used iperf to test the performance of the crossover link, and it is fine when no drbd syncing is going on.

I have tried various things such as setting sysctl.conf options:


net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216

but so far the only thing that prevents the "PingAck did not arrive in time" errors is to take the sync rate down to 1M.
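In other words, the only workaround so far is throttling the syncer. Sketched as a config change:

```
   syncer {
     rate 1M;   # down from 10M; the only change that stopped the PingAck errors
   }
```

After editing drbd.conf, running "drbdadm adjust all" should apply the new rate to the running resources without a restart.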

My Xen version info is:

Xen version 3.0.3-1 (Debian 3.0.3-0-4)

Please advise...


Xen-users mailing list

