WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-devel

[Xen-devel] [PATCH - proposed] XI Shadow Page Table Mechanism]

To: "Robert S. Phillips" <rphillips@xxxxxxxxxxxxxxx>
Subject: [Xen-devel] [PATCH - proposed] XI Shadow Page Table Mechanism]
From: zhu <vanbas.han@xxxxxxxxx>
Date: Fri, 30 Jun 2006 10:37:16 +0800
Cc: Xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Thu, 29 Jun 2006 19:38:06 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <44A3F073.50305@xxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <44A3CA74.4000206@xxxxxxxxxxxxxxx> <44A3D3CE.3040408@xxxxxxxxxxxxxxx> <44A3E08B.9090103@xxxxxxxxx> <44A3E98B.9040602@xxxxxxxxxxxxxxx> <44A3F073.50305@xxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Thunderbird 1.5.0.4 (Windows/20060516)
Hi, Robert,
I found another confusing code snippet:
in void xi_invl_mfn(struct domain *d, unsigned long mfn)
    if (ext && pfn < ext->large_page_aligned_size)

According to the code, it should be
    if (ext && (pfn>>SPT_ENTRIES_ORDER) < ext->large_page_aligned_size)

If I made any mistake, please point it out.
_______________________________________________________
Best Regards,
hanzhu


Robert S. Phillips wrote:
Hi Han,
Yes, it is a bug, but one of limited impact. The effect is that we will fail to keep the 'accessed' flag up-to-date, but only for L3 guest page table entries in 64-bit guests. Still, I'm very pleased that you spotted it.
-- rsp

Robert S. Phillips wrote:
Hi Han,
I think you're right. I'll get back to you soon after we review the ramifications.
Thanks!
-- rsp
p.s. Out of curiosity ... where do you work?

han wrote:
Hi Robert,
Thanks a lot for your thorough explanation.
However, in dt_init_action I notice this code snippet:
+            int _32pae_l3 = c_nr_levels <= 3 && c_level == 3;
Seems the test c_nr_levels <= 3 always holds here, doesn't it?
I'm afraid the code really means c_nr_levels == 3.

I suppose c_nr_levels should be 3 if the guest is 32-bit (PAE or not), and 4 if the guest is 64-bit. That is why I asked you about the code for uncombine_nr_levels.


_______________________________________________________
Best Regards,
hanzhu


Robert S. Phillips wrote:
Hello Han,

The puzzling function, uncombine_nr_levels(), should be reviewed together with the other two functions, combine_levels() and uncombine_level(), shown below. The problem I'm attempting to solve is minimizing the size of the decision table, which is currently 4096 entries.

The decision function depends on "level" (that is, the level of the current shadow page table) and, to a lesser extent, on "nr_levels" (that is, the number of guest paging levels). It really only cares if nr_levels is 3 or not, which is why uncombine_nr_levels() returns 3 or not 3.

Normally each of the two values can take three values (2, 3, or 4), making 9 combinations. I squeeze them down to 4 combinations, which cuts the decision table size by more than half.

The function combine_levels() is used to encode nr_levels and level into a number from 0 to 3.

The functions uncombine_nr_levels() and uncombine_level() do the inverse, sort of:
uncombine_level() returns the level (1, 2, 3, or 4),
and uncombine_nr_levels() returns 3 if the number of guest paging levels was 3, and 0 otherwise. This is all dt_init_action() needs to initialize the decision table; it only tests whether c_nr_levels is 3.

I hope this explanation helps.

-- rsp

/* Combine nr_levels and level into 4 combinations by
 * taking advantage of the fact that the rules for PTE levels
 * 3 and 4 are identical,
 * except that the PDPT (i.e. nr_levels == 3, level == 3)
 * is slightly irregular. */
static inline int combine_levels(int c_nr_levels,
                                int c_level)
{
   switch (c_level) {
   case 1: return 0;
   case 2: return 1;
   case 3: return c_nr_levels == 3
           ? 2   /* PDPT  */
           : 3;  /* treat like level 4*/
   default:      /* the compiler needs this */
   case 4: return 3;
   }
}

static inline int uncombine_level(int combo)
{
   return combo + 1;
}

static inline int uncombine_nr_levels(int combo)
{
   return combo == 2 ? 3 : 0;
}

Ben Thomas wrote:


-------- Original Message --------
Subject: Re: [Xen-devel] [PATCH - proposed] XI Shadow Page Table Mechanism
Date: Thu, 29 Jun 2006 20:25:13 +0800
From: han <vanbas.han@xxxxxxxxx>
To: Ben Thomas <bthomas@xxxxxxxxxxxxxxx>
References: <44A055FA.1010405@xxxxxxxxxxxxxxx>

Hi, Ben
I'm learning about your patch for the rewritten shadow pagetable code.
However, I feel confused about some part of your code:
+static inline int uncombine_nr_levels(int combo)
+{
+    return combo == 2 ? 3 : 0;
+}
shouldn't it be like below snippet:
+static inline int uncombine_nr_levels(int combo)
+{
+    return combo <= 2 ? 3 : 4;
+}

_______________________________________________________
Best Regards,
hanzhu


Ben Thomas wrote:
A post last week contained the design document for a shadow page
table mechanism. This post contains more information, and a pointer
to the code.

As a recap, the "XI Shadow Mechanism" is a design for shadow
page table code for fully virtualized HVM domains running on a 64-bit
Xen hypervisor.

This work was undertaken to address a number of goals.  These are
enumerated in the document posted last week and include:

- ability to run fully virtualized 32, 32PAE and 64 bit guest
  domains concurrently on a 64-bit hypervisor

- start support of live migration of fully virtualized domains

- provide good performance and robustness

- limit size and scope of change to Xen hypervisor

This design has been coded, and tested on a variety of operating
systems. I've included this weekend's test report.

Of particular interest from the following table are the Average Time
comparisons (smaller is better):

pfault   10.56 vs 2.43 (XI)
sync_all 39.72 vs 33.51 (XI)
invl_pg  22.75 vs 4.00 (XI)

The patch is too large for the mailing list, and may be obtained from

http://s3.amazonaws.com/VIDownloads/xi-20060626.patch

The patch applies against changeset 10352 (and possibly others)

We look forward to any comments and suggestions.

Thanks,
-b



XI Shadow Test Report

This report shows a comparison of test runs done on xen-unstable changeset 10352 with and without the XI patch applied. It also compares pfault, sync_all, and invl_pg performance.

With the XI patch 32bit SMP (PAE) guests now boot and pass the regression tests, closing Bug #661. With the XI patch 64bit SMP guests are much more stable and are no longer experiencing SEGVs and GPFs, closing Bug #678.

----------------------------------------------------------------------
| XEN      | Guest Kernel SMP kernels booted with 2 CPUs               |
| Changeset|-----------------------------------------------------------|
|          | 32bit Non-XI | 32bit XI     | 64bit Non-XI | 64bit XI     |
|          |--------------|--------------|--------------|--------------|
|          | Boot | Test  | Boot | Test  | Boot | Test  | Boot | Test  |
|----------|------|-------|------|-------|------|-------|------|-------|
| 10352    | Fail |       | Pass | 852/1 | Pass | 852/0 | Pass | 852/0 |
|          | (1)  |       |      | (2)   |      |(3,4,5)|      |(3,4)  |
----------------------------------------------------------------------

1. BUG 661: 32bit SMP guest crashes on boot:
   "(XEN) __hvm_bug at vmx.c:2289"
2. BUG 666: 32bit UP guest fail ltp gettimeofday02:
    "Time is going backwards"
3. BUG 663: 64bit SMP guest report these messages while running:
    "(XEN) spurious IRQ irq got=-1"
4. BUG 662: 64bit SMP guest report this message in ltp pth_str01/2/3:
    "(XEN) <vlapic_accept_irq>level trig mode repeatedly for vector 252"
5. BUG 678: 64bit 2CPU SMP guest failed several ltp tests with SEGVs or GPFs:
    "sshd[23654] general protection rip:2a9555d426 rsp:7fbfffd000 error 0"

The XI patch can be enabled or disabled at build time and includes some performance statistics that can be examined whether or not it's enabled. This allows us to compare the performance of XI vs the existing shadow code.
You access the statistics from the XEN console (^a^a^a). Typing a 'Y'
clears the counters and a 'y' displays the counters since the last clear.
Here is a comparison of XI vs Non-XI shadow code done on xen-unstable
changeset 10352. The guest is a RHEL4U2 64-bit 2-CPU SMP 256MB guest running a usex -b24 load for 5 minutes. Shadow statistics were cleared at the start of the run.

(XEN) Non-XI shadow performance statistics for the last 303 seconds
(XEN) Op type   # of Operations                                                            Total    Avg  Tick Profile
(XEN)            <2usec  <3usec  <4usec  <8usec <16usec <32usec <64usec >=64usec ops/msec   usec  In Op Locked Total
(XEN) pfault    6750685 1128922  624988 3249437 3833201 2863229 1224908  315992  19991362  10.56      0      0     0
(XEN)             33.7%    5.6%    3.1%   16.2%   19.1%   14.3%    6.1%    1.5%    211175
(XEN) sync_all     8176     380     333  159011  167619  305470  814598  298411   1753998  39.72      0      0     0
(XEN)              0.4%    0.0%    0.0%    9.0%    9.5%   17.4%   46.4%   17.0%     69670
(XEN) invl_pg       219      17     114   23333   40087   21904   24622    4639    114935  22.75      0      0     0
(XEN)              0.1%    0.0%    0.0%   20.3%   34.8%   19.0%   21.4%    4.0%      2615

(XEN) XI shadow performance statistics for the last 302 seconds
(XEN) Op type   # of Operations                                                            Total    Avg  Tick Profile
(XEN)            <2usec  <3usec  <4usec  <8usec <16usec <32usec <64usec >=64usec ops/msec   usec  In Op Locked Total
(XEN) pfault   23078414 4694139  485499  613419  396088 1427265   11976    8228  30715028   2.43      0      0     0
(XEN)             75.1%   15.2%    1.5%    1.9%    1.2%    4.6%    0.0%    0.0%     74690
(XEN) sync_all     1734      97     169  305433  484991  174300  150199  130405   1247328  33.51      0      0     0
(XEN)              0.1%    0.0%    0.0%   24.4%   38.8%   13.9%   12.0%   10.4%     41810
(XEN) invl_pg     10169  302977  125270   44927    5385   17603     776    1656    508763   4.00      0      0     0
(XEN)              1.9%   59.5%   24.6%    8.8%    1.0%    3.4%    0.1%    0.3%      2037







_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
