xen-devel
RE: [Xen-devel] Poor HVM performance with 8 vcpus
> With a specific benchmark producing a rather high load on memory management
> operations (lots of process creation/deletion and memory allocation) the 8
> vcpu performance was worse than the 4 vcpu performance. On other platforms
> (/390, MIPS, SPARC) this benchmark scaled rather well with the number of
> cpus.
>
> The result of the usage of the software performance counters of XEN seemed
> to point to the shadow lock being the reason. I modified the Hypervisor to
> gather some lock statistics (patch will be sent soon) and found that the
> shadow lock is really the bottleneck. On average 4 vcpus are waiting to
> get the lock!
At various points in the shadow pagetable code, Xen needs to be able to find
all the writeable mappings (PTEs) of a particular page. Rather than storing a
data structure to support the lookup from frame number to list of PTEs, we've
found that it is generally quicker to use a heuristic. The heuristic knows
where to look to find writeable mappings in a number of common OSes. For
example, it knows to look in the direct mapped (1:1) kernel address regions in
Linux, or the recursive linear mapping in Windows. If application of the
heuristics fails, Xen resorts to a brute force search.
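
To make the lookup order concrete, here is a toy Python model of "heuristics
first, brute force as fallback". It is only a sketch and not the Xen code: the
dictionary standing in for the guest page tables, the function names and the
0xC0000000 base are all assumptions made for illustration. It also stops at the
first hit, which is a simplification.

```python
PAGE_SHIFT = 12

def linux_direct_map(mfn, base=0xC0000000):
    """Hypothetical 1:1 heuristic: guess the vaddr where a frame is mapped."""
    return base + (mfn << PAGE_SHIFT)

def find_writeable_mappings(page_table, mfn, heuristics):
    """page_table maps vaddr -> (frame, writeable).

    Returns (list of vaddrs mapping mfn writeable, name of method used).
    """
    # Cheap path: ask each heuristic for a candidate address and check it.
    for guess in heuristics:
        va = guess(mfn)
        if page_table.get(va) == (mfn, True):
            return [va], guess.__name__
    # Slow path: walk every mapping. This is the part that keeps the
    # lock held for a long time when no heuristic matches the guest.
    hits = [va for va, (frame, writeable) in page_table.items()
            if frame == mfn and writeable]
    return hits, "brute_force"
```

A guest whose layout matches the guess is served by a single dictionary probe,
while any other layout pays for a walk over every entry.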
Unless BS2000 just happens to use the exact same virtual memory layout as one
of the other supported OSes, the heuristics will be failing. The brute force
search is rather slow, which will result in the shadow lock being held for an
extended period, resulting in lock convoys on SMP guests.
The quick fix is to add a heuristic for BS2000. However, the list of heuristics
is getting a bit unmanageable, and they're currently dumbly tried in-order.
Given the user-base size of BS2000, Keir is likely to insist the heuristic for
BS2000 is the last to be tried :)
At the very least it would be good to have a predictor which figured out which
of the several heuristics should actually be used for a given VM. A simple "try
whichever one worked last time first" should work fine.
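
A minimal sketch of such a predictor, assuming one instance per VM, could look
like the following. The class and method names are made up for illustration,
and the heuristics are represented by whatever objects the caller passes in.

```python
class HeuristicPredictor:
    """Per-VM ordering of heuristics: the last one that worked goes first."""

    def __init__(self, heuristics):
        self.order = list(heuristics)

    def lookup(self, try_heuristic):
        """try_heuristic(h) returns a result, or None if h did not apply.

        Returns (result, number of heuristics tried).
        """
        for i, h in enumerate(self.order):
            result = try_heuristic(h)
            if result is not None:
                if i:
                    # Move the winner to the front for the next lookup.
                    self.order.insert(0, self.order.pop(i))
                return result, i + 1
        return None, len(self.order)
```

With this ordering a guest only pays for the wrong guesses once; afterwards its
own heuristic is tried first, regardless of where it sits in the global list.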
Even smarter would be to just have heuristics for the two general classes of
mapping (1:1 and recursive), and have the code automatically figure out the
starting virtual address being used for a given guest.
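
For the 1:1 class, one way the base could be figured out is from mappings
already found by a brute force search: if vaddr minus (frame << PAGE_SHIFT)
comes out the same for every sample, that value is a plausible base. The sketch
below assumes 4K pages and insists that all samples agree, which is stricter
than a real implementation would probably want; the function name is
hypothetical. The recursive case is not shown.

```python
PAGE_SHIFT = 12

def infer_direct_map_base(samples):
    """samples: iterable of (vaddr, frame) pairs of known writeable mappings.

    Returns the common 1:1 base address if every sample implies the same
    one, otherwise None.
    """
    bases = {va - (frame << PAGE_SHIFT) for va, frame in samples}
    if len(bases) != 1:
        return None  # no samples, or the samples disagree
    base = bases.pop()
    return base if base >= 0 else None
```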
All fun stuff.
Ian
> Is this a known issue?
> Is there a chance to split the shadow lock into sub-locks or to use a
> reader/writer lock instead?
> I just wanted to ask before trying to understand all of the shadow code :-)
>
>
> Juergen
>
> --
> Juergen Gross Principal Developer Operating Systems
> TSP ES&S SWE OS6 Telephone: +49 (0) 89 636 47950
> Fujitsu Technology Solutions e-mail:
> juergen.gross@xxxxxxxxxxxxxx
> Otto-Hahn-Ring 6 Internet: ts.fujitsu.com
> D-81739 Muenchen Company details:
> ts.fujitsu.com/imprint.html
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel