xen-devel
Re: [Xen-devel] VT-d scalability issue
On Wed, Sep 10, 2008 at 10:27:12AM +0100, Espen Skoglund wrote:
> [Weidong Han]
> > Espen Skoglund wrote:
> >> Not regarding the other questions/objections in this thread for a
> >> moment --- what kind of performance improvements are we talking of here
> >> if the vcpus are pinned? Is it close to 1 VM or is there still some
> >> performance degradation due to IOTLB pressure?
>
> > Performance will definitely degrade due to IOTLB pressure when there are
> > many VMs, which exhausts the IOTLB.
>
> But how much of the degradation is due to IOTLB pressure and how much
> is due to vcpu pinning? If vcpu pinning doesn't give you much then
> why add the automatic pinning just to get a little improvement on
> older CPUs hooked up to a VT-d chipset?
Say the throughput of 1 pass-through domain is 100%.
Without vcpu pinning, the average throughput of 8 pass-through domains is 59%.
With vcpu pinning, the average is 95%.
So you can see how much vcpu pinning contributes to the performance.
>
> eSk
>
>
> > Randy (Weidong)
>
> >>
> >> [And talking of IOTLB pressure, why can't Intel document the IOTLB
> >> sizes in the chipset docs? Or even better, why can't these values be
> >> queried from the chipset?]
> >>
> >> eSk
> >>
> >>
> >> [Edwin Zhai]
> >>> Keir,
> >>> I have found a VT-d scalability issue and want to get some feedback.
> >>
> >>> When I assign a pass-through NIC to a Linux VM and increase the number
> >>> of VMs, the iperf throughput of each VM drops greatly. For example,
> >>> with 8 VMs running on a machine with 8 physical cpus and 8 iperf
> >>> clients, one connected to each VM, the final result is only 60% of 1 VM.
> >>
> >>> Further investigation shows that vcpu migration causes a "cold" cache
> >>> for the pass-through domain. The following code in vmx_do_resume
> >>> invalidates the original processor's cache on migration if the domain
> >>> has a pass-through device and there is no support for wbinvd vmexit.
> >>
> >>>     if ( has_arch_pdevs(v->domain) && !cpu_has_wbinvd_exiting ) {
> >>>         int cpu = v->arch.hvm_vmx.active_cpu;
> >>>         if ( cpu != -1 )
> >>>             on_selected_cpus(cpumask_of_cpu(cpu), wbinvd_ipi, NULL, 1, 1);
> >>>     }
> >>
> >>> So we want to pin each vcpu to a free processor at creation time for
> >>> domains with a pass-through device, just like what we did for NUMA
> >>> systems.
> >>
> >>> What do you think of it? Or do you have other ideas?
> >>
> >>> Thanks,
> >>
> >>
> >>> --
> >>> best rgds,
> >>> edwin
> >>
> >>> _______________________________________________
> >>> Xen-devel mailing list
> >>> Xen-devel@xxxxxxxxxxxxxxxxxxx
> >>> http://lists.xensource.com/xen-devel
>
>
--
best rgds,
edwin