xen-ia64-devel

[Xen-ia64-devel] [PATCH][RFC] performance tuning TAKE 7

To: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-ia64-devel] [PATCH][RFC] performance tuning TAKE 7
From: Isaku Yamahata <yamahata@xxxxxxxxxxxxx>
Date: Mon, 2 Oct 2006 16:23:03 +0900
Hi. These patches are for performance tuning TAKE 7.
They are against the following changeset of xen-ia64-unstable.hg:
11701:2bfd19fc1b79c6a6712c99f875f1fbf883af3f35

From the dom0 <-> domU benchmark results and counter-based analysis,
the xen/ia64 tlb flush overhead is successfully reduced with these patches.
However, domU network performance is still low.
There might be other issues somewhere else, I guess.
I'll suspend further investigation and want to merge these patches.
Then I'll move on to xen oprofile and the tlb miss issue (including huge
pages if possible). Merging these patches will be done as a background task.
If necessary, I'll come back to network performance later.


benchmark
=========
I ran the netperf benchmark very roughly with netperf -c -C -H <netserver> -l 100.
This is only to get a rough idea of the effects.
The network environment wasn't separated from other traffic,
each configuration was measured only once, and the variance of
netperf figures seems to be large.
If you need an accurate benchmark result, you should measure several times
and take the average yourself. (and let me know!)
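
For anyone repeating the measurement, here is a minimal sketch of the averaging step. The throughput values in `runs` are hypothetical placeholders, not measurements, and `summarize` is an illustrative helper name. It assumes the Mbits/sec figure from each netperf run has been collected by hand.

```python
import statistics


def summarize(samples_mbps):
    """Return (mean, sample standard deviation) of throughput samples in Mbits/sec."""
    if len(samples_mbps) < 2:
        raise ValueError("need at least two runs to estimate the spread")
    return statistics.mean(samples_mbps), statistics.stdev(samples_mbps)


# Hypothetical throughput figures from repeated runs (placeholders only).
runs = [100.0, 110.0, 90.0, 105.0, 95.0]
mean, spread = summarize(runs)
print(f"mean={mean:.2f} Mbits/sec, stdev={spread:.2f} Mbits/sec over {len(runs)} runs")
```

Reporting the standard deviation next to the mean shows whether a difference between vanilla and patched builds is larger than the run-to-run noise.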

* environment
tiger4 
  CPU: 4 packages x 2 cores x 2 HT
  Native: RHEL AS Release 4 Update 2: tiger4 
  dom0: tiger4, vcpu=4
  domU: tiger4-g0, vcpu=8
  NIC: e1000

em64t 
  CPU: Intel(R) Pentium(R) 4 CPU 3.60GHz stepping 0a 
  memory: 1GB
  NIC: Tigon3 [partno(BCM95789) rev 4101 PHY(5750)] 

* result
target <-> em64t(Mbits/sec)
target                          target -> em64t         em64t -> target
                                netperf   netserver     netperf  netserver
Native                          723.33                  909.49
dom0(vanilla 11701)             673.96                  836.05
dom0(patched)                   675.75                  837.75
domU(vanilla 11701)             136.28                   77.78
domU(patched)                   249.32                  143.60


dom0 <-> domU in the same box (Mbits/sec)
                                domU -> dom0            dom0 -> domU
                                netperf netserver       netperf netserver
vanilla xen(C/S 11701)          576.71                  329.49
patched                         973.99                  930.08
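
To make the tables easier to read, the sketch below computes the patched/vanilla ratio for each row pair, and the patched domU throughput as a fraction of native. The numbers are copied from the tables above; the dictionary layout and the `speedup` helper are illustrative choices only.

```python
# Throughput figures (Mbits/sec) copied from the result tables: (vanilla, patched).
results = {
    "domU, target -> em64t": (136.28, 249.32),
    "domU, em64t -> target": (77.78, 143.60),
    "same box, domU -> dom0": (576.71, 973.99),
    "same box, dom0 -> domU": (329.49, 930.08),
}

# Native figures (Mbits/sec) for the two target/em64t directions.
native = {
    "domU, target -> em64t": 723.33,
    "domU, em64t -> target": 909.49,
}


def speedup(vanilla, patched):
    """Ratio of patched throughput to vanilla throughput."""
    return patched / vanilla


for name, (vanilla, patched) in results.items():
    line = f"{name}: {speedup(vanilla, patched):.2f}x"
    if name in native:
        # Patched domU throughput as a percentage of the native figure.
        line += f", {100.0 * patched / native[name]:.1f}% of native"
    print(line)
```

Since each figure was measured only once, these ratios are rough indications rather than precise speedups.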


patches
=======
- performance counter
- p2m exposure
- per vcpu vhpt
- tlb tracking
  - grant table transfer 
  - netback skbuff preregister
  - netfront page preregister
  - netback page preregister
- deferred page freeing
- tlb flush clock
- micro optimize __domain_flush_vtlb_track_entry
- suppress clear_pages


patch detail
============
- per vcpu vhpt
  This focuses on vcpu migration between physical cpus.
  With the credit scheduler, vcpus are migrated heavily.
  This patch tries to reduce vTLB flushes when a vcpu is migrated.

- p2m exposure
  DMA paravirtualization requires conversion from pseudo physical addresses
  to machine addresses. Currently it is done by hypercall.
  This patch tries to reduce the conversion overhead by mapping the xen
  p2m table read-only into the domain.

- tlb tracking
  This focuses on grant table mapping.
  When a page is unmapped, a full vTLB flush is necessary.
  By tracking tlb inserts on grant mapped pages, the full vTLB flush
  can be avoided.
  In particular, vbd does only DMA, so dom0 doesn't insert a tlb entry
  for the grant mapped page. In that case no vTLB flush is needed at all.
  
- netback skbuff/netfront/netback page tlb tracking
  This focuses on grant table transfer.
  When a page is transferred, a full vTLB flush is necessary on both
  the sender domain and the receiver domain.
  By preregistering the page, Xen/IA64 begins to track tlb inserts on
  registered pages, so the full flush can be avoided as with grant
  table mapping.

- deferred page freeing
  When a page on which tlb inserts aren't tracked is unmapped/zapped from
  a domain, a full vTLB flush is again necessary.
  This is the case for the balloon driver and grant table page transfer.
  This patch focuses on that case.
  It tries to batch freeing/zapping pages from a domain in order
  to reduce the number of full vTLB flushes.
  It also modifies the tlb track page hypercall semantics and
  reimplements the tlb untrack page hypercall.
  It tries to reduce the vTLB flush cost of the
  tlb track/untrack/zap page hypercalls by batching them using a timer.

- tlb flush clock
  This is intended to be a counterpart of the Xen/x86 tlb flush clock.
  But it is used only at vcpu context switch, not for lazy tlb flush.
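
The tlb tracking idea described above can be illustrated with a toy model. This is only a sketch under assumptions, not the code in the patches: the class and method names are hypothetical, and the policy of flushing a single entry when exactly one insert was seen (and falling back to a full flush when several were seen) is an assumption for illustration. The description above only states that the full vTLB flush can be avoided, and that no flush is needed when no tlb entry was inserted.

```python
class TrackingDomain:
    """Toy model of a domain whose grant mapped pages are tlb-tracked."""

    def __init__(self):
        self.tracked = {}        # page frame number -> set of (vcpu, vaddr) inserts
        self.full_flushes = 0    # number of full vTLB flushes performed
        self.entry_flushes = 0   # number of single-entry flushes performed

    def grant_map(self, pfn):
        # Start tracking tlb inserts from the moment the page is mapped.
        self.tracked[pfn] = set()

    def tlb_insert(self, pfn, vcpu, vaddr):
        # Record the insert only if the page is being tracked.
        if pfn in self.tracked:
            self.tracked[pfn].add((vcpu, vaddr))

    def unmap(self, pfn):
        """Unmap a page and return which kind of flush was needed."""
        inserts = self.tracked.pop(pfn, None)
        if inserts is None:
            # Untracked page: nothing is known, so flush everything.
            self.full_flushes += 1
            return "full"
        if not inserts:
            # No tlb entry was ever inserted (e.g. DMA only): no flush.
            return "none"
        if len(inserts) == 1:
            # Assumed policy: one known entry can be purged individually.
            self.entry_flushes += 1
            return "entry"
        # Assumed policy: too many entries to purge one by one.
        self.full_flushes += 1
        return "full"


dom0 = TrackingDomain()
dom0.grant_map(0x100)                      # vbd-like case: DMA only
print(dom0.unmap(0x100))
dom0.grant_map(0x200)
dom0.tlb_insert(0x200, vcpu=0, vaddr=0xA000)
print(dom0.unmap(0x200))
print(dom0.unmap(0x300))                   # page that was never tracked
```

The three calls cover the DMA-only case, a page touched through one mapping, and an untracked page, which is the case the deferred page freeing patch tries to make cheaper by batching.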

included patches
================
11457:de77bfdecfbe_avoid_long_time_interrupt_masking.patch
11458:2bf4fc5ee839_perfc_for_vtlb_flush.patch
11459:dc1c8c91d249_perfc_mm_c.patch
11460:edbfec69d631_perfc_dom0vp_p2m_and_m2p.patch
11461:357d5479c0ff_p2m_exposure_xen_side.patch
11462:dde3a660f354_p2m_exposure_linux_side.patch
11463:1ae54e6b7ac9_p2m_exposure_test_module.patch
11464:065b48a99038_script_for_p2m_test_module.patch
11465:96b229487ae2_pervcpu_vhpt.patch
11466:da72199ba08c_fix_pte_flags_conflict.patch
11467:677fdf7aa2de_import_linux_hash.h.patch
11468:114c67d3d090_tlb_track.patch
11469:c5fde1737a9b_deferred_page_freeing.patch
11470:e123f0373d66_skbuff_tlb_tracking_xen_side.patch
11471:1313603b6f82_skbuff_tlb_tracking_linux_side.patch
11472:14a194e7caa9_tlb_track_netfront_page_xen_side.patch
11473:31a91097ca2b_tlb_tracking_on_netfront_page_linux_side.patch
11474:644d8aa4ce8f_tlbflush_clock.patch
11475:3debc96c950d_tlb_zap_page_hypercall_xen_side.patch
11476:a276174da6dd_tlb_zap_hypercall_linux_side.patch

FWIW my dot configs are as follows
- xen dot config
crash_debug=y
debug=y
verbose=y
xen_ia64_dom0_virtual_physical=y
xen_ia64_tlb_track=y
#xen_ia64_tlb_track_cnt=y
xen_ia64_tlb_track_cnt=n
xen_ia64_tlb_track_grant_table_page_transfer=y
xen_ia64_tlb_track_skbuff=y
xen_ia64_tlb_track_netfront_page=y
xen_ia64_tlb_track_deferred_flush=y
xen_ia64_pervcpu_vhpt=y
xen_ia64_deferred_free=y
xen_ia64_tlbflush_clock=y
xen_ia64_tlbflush_clock_tlb_track_entry=y
xen_ia64_clear_page=n

perfc=y
perfc_arrays=y

- Linux dot config includes
CONFIG_XEN_IA64_VDSO_PARAVIRT=y
CONFIG_XEN_IA64_EXPOSE_P2M=y
CONFIG_XEN_IA64_EXPOSE_P2M_USE_DTR=y
CONFIG_XEN_IA64_TLB_TRACK_SKBUFF=y
CONFIG_XEN_IA64_TLB_TRACK_NETFRONT_PAGE=y
CONFIG_XEN_IA64_TLB_TRACK_NETBACK_PAGE=y


thanks.
-- 
yamahata

Attachment: perf-tuning-take-7.tar.bz2
Description: Binary data
