This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


[Xen-devel] million cycle interrupt

To: "Xen-Devel (E-mail)" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] million cycle interrupt
From: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
Date: Sun, 12 Apr 2009 20:16:35 +0000 (GMT)
Delivery-date: Sun, 12 Apr 2009 13:17:24 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
(I realize my last similar topic turned out to be a false alarm,
so I hope I am not "crying wolf" again.)

I'm still trying to understand some anomalous measurements in
tmem, particularly instances where the maximum cycle count
greatly exceeds the average.  It seems that tmem's compression
code is a magnet for interruptions.  This inspired me to
create a more controlled magnet, an "interrupt honeypot".
To do this, at every tmem call, I run a loop which does nothing
but repeatedly read the TSC and check the difference between
successive reads.  On both of my test machines, the measurement
is normally well under 100 cycles.  Infrequently, I get a "large"
measurement which, since Xen is non-preemptive, indicates
a lengthy interrupt (or possibly that the TSC is getting moved
forward).  My code uses per_cpu to ensure that there aren't
any measurement/recording races (which were the issue with
my previous 10M "problem").
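(For reference, the honeypot loop is essentially the following.  This is
a standalone user-space sketch, not the actual tmem patch: it uses the
__rdtsc() intrinsic and plain local variables where the in-hypervisor
version reads the TSC directly and records results via per_cpu.  The
function name and the 100-cycle threshold are illustrative.)

```c
#include <stdint.h>
#include <x86intrin.h>   /* __rdtsc() */

/* Spin for 'iters' iterations reading the TSC back-to-back; return the
 * largest gap seen between successive reads, and count in *hits how
 * many gaps exceeded 'threshold' cycles.  Since nothing runs between
 * the two reads, a big gap means we were interrupted (or the TSC was
 * moved forward underneath us). */
static uint64_t measure_max_tsc_gap(long iters, uint64_t threshold,
                                    unsigned long *hits)
{
    uint64_t prev = __rdtsc();
    uint64_t max_delta = 0;

    *hits = 0;
    for (long i = 0; i < iters; i++) {
        uint64_t now = __rdtsc();
        uint64_t delta = now - prev;

        if (delta > threshold)
            (*hits)++;
        if (delta > max_delta)
            max_delta = delta;
        prev = now;
    }
    return max_delta;
}
```

In the hypervisor the max (and the cpu it was seen on) would be stored
in per-cpu variables and dumped later, to avoid cross-cpu recording
races.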

The result:  On my quad-core-by-two-thread machine, I frequently
get "large" measurements over 250,000 cycles, with the maximum
just over 1 million (and, in fact, just over 2^20).  Large
measurements occur on average about once every 1-2 seconds, but
the measurement methodology makes it impossible to determine the
true frequency or spacing.  The vast majority of the "large"
samples are reported on cpu#0, but a handful are reported on other
cpus.  This might also be methodology-related, but note that the
load is running on 4 vcpus.

On the same machine, when I run with nosmp, I see no large
measurements.  And when I run the load with 1 vcpu, I see
a lower frequency (about one every ten seconds), but again
this could be due to the measurement methodology.

On my dual-core (no SMT) test machine, I see only a couple of
large measurements, 120438 cycles on cpu#0 and 120528 on cpu#1.
The same load is being run, though limited to 2 vcpus.

Is a million cycles in an interrupt handler bad?  Any idea what
might be consuming this?  The evidence might imply that more cpus
mean longer interrupts, which bodes poorly for larger machines.
I tried disabling the timer rendezvous code (though I'm not positive
I was successful), but still got large measurements, and eventually
the machine froze up (but not before I observed the stime skew
climbing quickly to the millisecond-plus range).

Is there a way to cap the number of physical cpus seen by Xen
(other than nosmp to cap at one)?


Xen-devel mailing list