

[Xen-devel] 10 million cycles disappearing

To: "Xen-Devel (E-mail)" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] 10 million cycles disappearing
From: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
Date: Tue, 7 Apr 2009 23:54:01 +0000 (GMT)
I've been seeing a possible performance problem off and on
and I've spent some time tracking it but haven't made much
progress and have to give up for now, so I thought I'd at
least document what I know and see if it sounds familiar
to anyone.

The problem: Something in Xen seems to periodically take about
10M cycles.  I think it is an interrupt and I think it is
taking a lock related to memory allocation and holding
it for a LONG time (i.e. 10M cycles or close).

I am measuring inside a hypercall using TSC, taking a TSC
reading at entry to the hypercall code and at exit.  Xen
is not pre-emptive, so it can't be switching context or
something, right?  Nearly all of the readings are less
than 100K cycles, but some samples are "huge" and
usually at 9M-10M cycles.  Since I am recording the max
difference between the TSCs, the max "huge" grows over
a long period of time, but eventually converges close
to 10M (on this 3GHz processor, 10M cycles is roughly 3.3 ms).  I can see
it grow using "watch".  And I've NEVER seen a reading
over 10M.
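
For anyone wanting to reproduce the measurement, the bookkeeping described above could be sketched roughly as follows. This is not the actual instrumentation: the names cycle_stats, cycle_stats_record, cycles_to_us and HUGE_THRESHOLD_CYCLES are hypothetical, and the TSC values are passed in as parameters rather than read with a real TSC instruction, so the sketch stays portable. The 100K threshold and the 3GHz rate are the figures quoted in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; none of this is taken from the Xen source. */

/* "Nearly all of the readings are less than 100K cycles." */
#define HUGE_THRESHOLD_CYCLES 100000ULL

struct cycle_stats {
    uint64_t samples;    /* number of hypercalls measured */
    uint64_t huge;       /* samples above HUGE_THRESHOLD_CYCLES */
    uint64_t max_delta;  /* largest entry-to-exit difference seen so far */
};

/*
 * Record one measurement. tsc_entry and tsc_exit are the TSC readings
 * taken at entry to and exit from the hypercall code.
 */
static void cycle_stats_record(struct cycle_stats *s,
                               uint64_t tsc_entry, uint64_t tsc_exit)
{
    uint64_t delta;

    assert(s != NULL);
    delta = tsc_exit - tsc_entry;   /* elapsed cycles for this call */

    s->samples++;
    if (delta > HUGE_THRESHOLD_CYCLES)
        s->huge++;
    if (delta > s->max_delta)
        s->max_delta = delta;       /* the max only ever grows */
}

/*
 * Convert a cycle count to whole microseconds for a given TSC rate.
 * Assumes cycles * 1000000 fits in 64 bits, which holds for the
 * magnitudes discussed here.
 */
static uint64_t cycles_to_us(uint64_t cycles, uint64_t tsc_hz)
{
    return cycles * 1000000ULL / tsc_hz;
}
```

Because only the maximum is kept, a rare long delay shows up as a value that creeps upward over time and then stays put, which matches the behaviour seen with "watch". Counting the "huge" samples separately gives a rough idea of how often the delay happens, not just how bad it gets.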

I am able to disable interrupts and still take
measurements.  Roughly half of the measurements
occur when doing a hypercall-subop that does no
memory allocation and roughly half occur when doing
a hypercall-subop that DOES do memory allocation.
With interrupts disabled, the subop that DOES do
memory allocation still asymptotically approaches
10M.  The one that does NOT do memory allocation
stays relatively small.

I'm currently measuring on Xen 3.3.1 but I think I've
seen similar results on xen-unstable.  A single 2-vcpu
domain is running (in addition to domain0).

Does any of that sound familiar?  Any smoking guns?
