xen-devel

Re: [Xen-devel] swiotlb=force in Konrad's xen-pcifront-0.8.2 pvops domU

To: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
Subject: Re: [Xen-devel] swiotlb=force in Konrad's xen-pcifront-0.8.2 pvops domU kernel with PCI passthrough
From: Dante Cinco <dantecinco@xxxxxxxxx>
Date: Thu, 18 Nov 2010 17:38:24 -0800
Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, mathieu.desnoyers@xxxxxxxxxx, Andrew Thomas <andrew.thomas@xxxxxxxxxx>, Konrad Wilk <konrad.wilk@xxxxxxxxxx>, "Lin, Ray" <Ray.Lin@xxxxxxx>, keir.fraser@xxxxxxxxxxxxx, Chris Mason <chris.mason@xxxxxxxxxx>
Delivery-date: Thu, 18 Nov 2010 17:39:19 -0800
In-reply-to: <563ba65b-51ca-43d5-99a0-353988dad721@default>
References: <EB4C61A1A2501842A04B573FE42B14D60137593FF5@xxxxxxxxxxxxxxxxx> <563ba65b-51ca-43d5-99a0-353988dad721@default>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx

On Thu, Nov 18, 2010 at 4:20 PM, Dan Magenheimer
<dan.magenheimer@xxxxxxxxxx> wrote:
>> We did suspect it, since our old setting was HZ=1000 and we assigned
>> more than 10 VCPUs to domU. But we don't see the performance difference
>> with HZ=100.
>
> FWIW, it didn't appear that the problems were proportional to HZ.
> Seemed more that somehow the pvclock became incorrect and spent
> a lot of time rereading the pvclock value.

We decided to enable lock stat in the kernel to track down all the
lock activity in the profile report. The first thing I noticed was that
kmemleak was at the top of the list (/proc/lock_stat), so we disabled
kmemleak. This boosted our I/O performance from 31k to 119k IOPS.
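
For anyone who wants to repeat the "top of the list" check, here is a
minimal sketch that ranks lock classes from a /proc/lock_stat dump by total
wait time. It assumes the documented row layout, "class name:" followed by
numeric columns with contentions second and waittime-total fifth; that
layout can differ between kernel versions. The sample text and the lock
names in it are made up for illustration and are not taken from this thread.

```python
import re

# A per-class row looks like "<class name>: <num> <num> ..." (assumed layout).
ROW = re.compile(r"^\s*(?P<name>\S.*?):\s+(?P<nums>[\d.]+(?:\s+[\d.]+)+)\s*$")


def top_contended(text, limit=5):
    """Return (class name, contentions, waittime-total) sorted by wait time."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header, separator, or call-site line
        nums = match.group("nums").split()
        if len(nums) < 5:
            continue  # not a full statistics row
        # Column 2 = contentions, column 5 = waittime-total (assumption).
        rows.append((match.group("name"), int(nums[1]), float(nums[4])))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:limit]


# Hypothetical sample, NOT real data from the system discussed here.
SAMPLE = """\
lock_stat version 0.3
--------------------------------------------------------------------------
      class name  con-bounces  contentions  waittime-min  waittime-max  waittime-total  acq-bounces  acquisitions  holdtime-min  holdtime-max  holdtime-total
--------------------------------------------------------------------------

  example_lock_a:          46           84          0.26        939.10        16371.53        47291       2922365          0.16       2220.69       174640.32
  example_lock_b:          37          100          1.31        299.61          325.52       212344      34316685          0.10       7744.91     95016910.20
"""

if __name__ == "__main__":
    for name, contentions, wait_total in top_contended(SAMPLE):
        print(f"{name:20s} contentions={contentions:8d} wait_total={wait_total:.2f}")
```

On a real system the same function could be fed the contents of
/proc/lock_stat instead of SAMPLE; sorting by contentions rather than wait
time is an equally reasonable choice.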

One of our developers (Bruce Edge) suggested killing ntpd, so I did.
This resulted in another significant bump in I/O performance, to 209k
IOPS. The question now is: why ntpd? Is it the source of all or most of
those pvclock_clocksource_read calls in the profile report?
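
To put a number on how much of the quoted profile
pvclock_clocksource_read accounts for, here is a small sketch that sums the
rows of the oprofile report quoted further down in this message by symbol.
The sample counts and percentages are copied from that report; the short
label "dom0-vmlinux" is my own shorthand for the long dom0 vmlinux image
name. It only quantifies the share; it says nothing about whether ntpd is
the caller.

```python
from collections import defaultdict

# (samples, percent, app name, symbol) copied from the quoted oprofile report.
# "dom0-vmlinux" is shorthand (an assumption of this sketch) for the dom0 image.
PROFILE = [
    (918089, 27.9310, "domain1-kernel", "pvclock_clocksource_read"),
    (217811, 6.6265, "domain1-modules", "/domain1-modules"),
    (188327, 5.7295, "dom0-vmlinux", "mutex_spin_on_owner"),
    (186684, 5.6795, "domain1-kernel", "__xen_spin_lock"),
    (149514, 4.5487, "domain1-kernel", "__write_lock_failed"),
    (123278, 3.7505, "domain1-kernel", "__kernel_text_address"),
    (122906, 3.7392, "domain1-kernel", "xen_spin_unlock"),
    (90903, 2.7655, "domain1-kernel", "__spin_time_accum"),
    (85880, 2.6127, "domain1-kernel", "__module_address"),
    (75223, 2.2885, "domain1-kernel", "print_context_stack"),
    (66778, 2.0316, "domain1-kernel", "__module_text_address"),
    (57389, 1.7459, "domain1-kernel", "is_module_text_address"),
    (47282, 1.4385, "domain1-xen", "syscall_enter"),
    (47219, 1.4365, "domain1-kernel", "prio_tree_insert"),
    (46495, 1.4145, "dom0-vmlinux", "pvclock_clocksource_read"),
    (44501, 1.3539, "domain1-kernel", "prio_tree_left"),
    (32482, 0.9882, "domain1-kernel", "native_read_tsc"),
]


def totals_by_symbol(rows):
    """Sum samples and percent per symbol across all images/domains."""
    totals = defaultdict(lambda: [0, 0.0])
    for samples, percent, _app, symbol in rows:
        totals[symbol][0] += samples
        totals[symbol][1] += percent
    return {symbol: (s, p) for symbol, (s, p) in totals.items()}


if __name__ == "__main__":
    samples, percent = totals_by_symbol(PROFILE)["pvclock_clocksource_read"]
    print(f"pvclock_clocksource_read: {samples} samples, {percent:.2f}% of total")
```

Counting both the domU and dom0 rows, pvclock_clocksource_read comes to
964584 samples, a bit over 29% of everything sampled in that 20-second run.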

>
>> -----Original Message-----
>> From: Lin, Ray [mailto:Ray.Lin@xxxxxxx]
>> Sent: Thursday, November 18, 2010 2:40 PM
>> To: Dan Magenheimer; Dante Cinco; Konrad Wilk
>> Cc: Jeremy Fitzhardinge; Xen-devel; mathieu.desnoyers@xxxxxxxxxx;
>> Andrew Thomas; keir.fraser@xxxxxxxxxxxxx; Chris Mason
>> Subject: RE: [Xen-devel] swiotlb=force in Konrad's xen-pcifront-0.8.2
>> pvops domU kernel with PCI passthrough
>>
>>
>>
>> -----Original Message-----
>> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Dan Magenheimer
>> Sent: Thursday, November 18, 2010 1:21 PM
>> To: Dante Cinco; Konrad Wilk
>> Cc: Jeremy Fitzhardinge; Xen-devel; mathieu.desnoyers@xxxxxxxxxx;
>> Andrew Thomas; keir.fraser@xxxxxxxxxxxxx; Chris Mason
>> Subject: RE: [Xen-devel] swiotlb=force in Konrad's xen-pcifront-0.8.2
>> pvops domU kernel with PCI passthrough
>>
>> In case it is related:
>> http://lists.xensource.com/archives/html/xen-devel/2010-07/msg01247.html
>>
>> Although I never went further on this investigation, it appeared to me
>> that pvclock_clocksource_read was getting called at least an
>> order-of-magnitude more frequently than expected in some circumstances
>> for some kernels.  And IIRC it was scaled by the number of vcpus.
>>
>> We did suspect it, since our old setting was HZ=1000 and we assigned
>> more than 10 VCPUs to domU. But we don't see the performance difference
>> with HZ=100.
>>
>> > -----Original Message-----
>> > From: Dante Cinco [mailto:dantecinco@xxxxxxxxx]
>> > Sent: Thursday, November 18, 2010 12:36 PM
>> > To: Konrad Rzeszutek Wilk
>> > Cc: Jeremy Fitzhardinge; Xen-devel; mathieu.desnoyers@xxxxxxxxxx;
>> > Andrew Thomas; keir.fraser@xxxxxxxxxxxxx; Chris Mason
>> > Subject: Re: [Xen-devel] swiotlb=force in Konrad's xen-pcifront-0.8.2
>> > pvops domU kernel with PCI passthrough
>> >
>> > I mentioned in a previous post to this thread that I'm able
>> > to apply Dulloor's xenoprofile patch to the dom0 kernel but not the
>> > domU kernel. So I can't do active-domain profiling, only
>> > passive-domain profiling, and I don't know how reliable the results
>> > are, since it shows pvclock_clocksource_read as the top consumer of CPU
>> > cycles at 28%.
>> >
>> > CPU: Intel Architectural Perfmon, speed 2665.98 MHz (estimated)
>> > Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a
>> > unit mask of 0x00 (No unit mask) count 100000
>> > samples   %        image name  app name  symbol name
>> > 918089    27.9310  vmlinux-2.6.36-rc7-pvops-kpcif-08-2-domu-5.11.dcinco-debug  domain1-kernel  pvclock_clocksource_read
>> > 217811    6.6265   domain1-modules  domain1-modules  /domain1-modules
>> > 188327    5.7295   vmlinux-2.6.32.25-pvops-stable-dom0-5.7.dcinco-debug  vmlinux-2.6.32.25-pvops-stable-dom0-5.7.dcinco-debug  mutex_spin_on_owner
>> > 186684    5.6795   vmlinux-2.6.36-rc7-pvops-kpcif-08-2-domu-5.11.dcinco-debug  domain1-kernel  __xen_spin_lock
>> > 149514    4.5487   vmlinux-2.6.36-rc7-pvops-kpcif-08-2-domu-5.11.dcinco-debug  domain1-kernel  __write_lock_failed
>> > 123278    3.7505   vmlinux-2.6.36-rc7-pvops-kpcif-08-2-domu-5.11.dcinco-debug  domain1-kernel  __kernel_text_address
>> > 122906    3.7392   vmlinux-2.6.36-rc7-pvops-kpcif-08-2-domu-5.11.dcinco-debug  domain1-kernel  xen_spin_unlock
>> > 90903     2.7655   vmlinux-2.6.36-rc7-pvops-kpcif-08-2-domu-5.11.dcinco-debug  domain1-kernel  __spin_time_accum
>> > 85880     2.6127   vmlinux-2.6.36-rc7-pvops-kpcif-08-2-domu-5.11.dcinco-debug  domain1-kernel  __module_address
>> > 75223     2.2885   vmlinux-2.6.36-rc7-pvops-kpcif-08-2-domu-5.11.dcinco-debug  domain1-kernel  print_context_stack
>> > 66778     2.0316   vmlinux-2.6.36-rc7-pvops-kpcif-08-2-domu-5.11.dcinco-debug  domain1-kernel  __module_text_address
>> > 57389     1.7459   vmlinux-2.6.36-rc7-pvops-kpcif-08-2-domu-5.11.dcinco-debug  domain1-kernel  is_module_text_address
>> > 47282     1.4385   xen-syms-4.1-unstable  domain1-xen  syscall_enter
>> > 47219     1.4365   vmlinux-2.6.36-rc7-pvops-kpcif-08-2-domu-5.11.dcinco-debug  domain1-kernel  prio_tree_insert
>> > 46495     1.4145   vmlinux-2.6.32.25-pvops-stable-dom0-5.7.dcinco-debug  vmlinux-2.6.32.25-pvops-stable-dom0-5.7.dcinco-debug  pvclock_clocksource_read
>> > 44501     1.3539   vmlinux-2.6.36-rc7-pvops-kpcif-08-2-domu-5.11.dcinco-debug  domain1-kernel  prio_tree_left
>> > 32482     0.9882   vmlinux-2.6.36-rc7-pvops-kpcif-08-2-domu-5.11.dcinco-debug  domain1-kernel  native_read_tsc
>> >
>> > I ran oprofile (0.9.5 with xenoprofile patch) for 20 seconds while
>> > the I/Os were running. Here's the command I used:
>> >
>> > opcontrol --start --xen=/boot/xen-syms-4.1-unstable \
>> >   --vmlinux=/boot/vmlinux-2.6.32.25-pvops-stable-dom0-5.7.dcinco-debug \
>> >   --passive-domains=1 \
>> >   --passive-images=/boot/vmlinux-2.6.36-rc7-pvops-kpcif-08-2-domu-5.11.dcinco-debug
>> >
>> > I had to remove dom0_max_vcpus=1 (but kept dom0_vcpus_pin=true) in the
>> > Xen command line. Otherwise, oprofile only gives the samples from
>> > CPU0.
>> >
>> > I'm going to try perf next.
>> >
>> > - Dante
>> >
>> > _______________________________________________
>> > Xen-devel mailing list
>> > Xen-devel@xxxxxxxxxxxxxxxxxxx
>> > http://lists.xensource.com/xen-devel
>>
>
