Re: [Xen-devel] [PATCH] turn off writable page tables

Keir Fraser wrote:

On 31 Jul 2006, at 10:32, Zachary Amsden wrote:
It would allow set_pte() to switch between explicit queuing and 'direct' writing. We moved away from the former a few years back as doing it everywhere made a mess of the generic Linux mm code and it was hard to reason whether our patches were correct. I guess doing it for the most important subset of mm routines is not so bad. It's a shame that, although many set_pte() call sites could determine statically whether or not they will batch, we'd end up with a dynamic run-time test everywhere (unless I'm mistaken) -- I wonder if that has a measurable cost?
We've actually seen a benefit for this, despite the cost of the non-static parameters, for both VMI Linux with shadow pagetables on ESX and VMI Linux with direct pagetables on Xen. Turns out that as long as the call EIP is predictable, the parameters do not necessarily need to be so, and modern processors are getting much better at branch prediction.
You mean that the benefit of batching outweighs the cost of an extra test-and-branch in the middle of a loop, or that the extra test-and-branch simply has unmeasurable overhead? The former is to be expected, but I'd be worried about other call sites where batching does not happen, and an effect on those.

The extra test-and-branch has unmeasurable overhead. In the implementation we had chosen, there was already a branch requirement on the set_pte call anyway, to potentially delay the pte update so that it can piggyback onto a page invalidation with just one hypercall. Combining the two branches into one is trivial, and the cost of one extra branch here seems to be invisible. We were getting better numbers for MMU related workloads with VMI-Linux than XenoLinux was. I don't have hard numbers on this, and even if I did, it would take some time to get them approved for public distribution. For that I must apologize. But avoiding the changes that would otherwise be required - a full set of pte and tlb functions which could be delayed, as well as combining the pte update and invlpg into a single call - seemed worth a single branch. I'm not even convinced these changes can be done in a way that would be safe for all architectures. Of course, I may be wrong on that point - but there is no simple way I see to do it that affords the strong reasoning about correctness that the enter / leave semantic does.
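For illustration only, here is a minimal user-space sketch of the enter / leave idea with a single run-time test inside set_pte(). It is not the VMI or Xen implementation: the names enter_lazy_mmu(), leave_lazy_mmu(), flush_queue(), the queue size, and the hypercalls counter are all hypothetical, and the "hypercall" is simulated by a plain loop that applies the queued writes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t pte_t;

#define QUEUE_MAX 32            /* hypothetical batch size */

struct pte_update {
    pte_t *ptep;
    pte_t  val;
};

static struct pte_update queue[QUEUE_MAX];
static size_t   queue_len;
static int      lazy_mmu;       /* the single state variable */
static unsigned hypercalls;     /* counts simulated hypervisor entries */

/* Stand-in for a multi-update hypercall: apply all queued writes at once. */
static void flush_queue(void)
{
    if (queue_len == 0)
        return;
    for (size_t i = 0; i < queue_len; i++)
        *queue[i].ptep = queue[i].val;
    queue_len = 0;
    hypercalls++;
}

static void set_pte(pte_t *ptep, pte_t val)
{
    if (lazy_mmu) {             /* the one run-time test-and-branch */
        if (queue_len == QUEUE_MAX)
            flush_queue();
        queue[queue_len++] = (struct pte_update){ ptep, val };
        return;
    }
    *ptep = val;                /* direct mode: modeled as one entry per write */
    hypercalls++;
}

/* Explicit batching brackets; nesting is not allowed in this sketch. */
static void enter_lazy_mmu(void)
{
    assert(!lazy_mmu);
    lazy_mmu = 1;
}

static void leave_lazy_mmu(void)
{
    assert(lazy_mmu);
    flush_queue();
    lazy_mmu = 0;
}
```

In this model, a loop that sets seven PTEs between enter_lazy_mmu() and leave_lazy_mmu() costs one simulated hypervisor entry instead of seven, and call sites outside the bracket pay only the extra branch.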

Doing explicit batching exactly where it counts, under protection of locks, so that SMP safety is guaranteed turns out to be really easy, as well as a nice win.
If the run-time check cost really isn't an issue (I'd like to see numbers), we'd likely use this new interface in preference to implicitly batched writable pagetables and would support its inclusion in the kernel.

Sorry about not having numbers. My biggest question is - do you need any other information than simply a single state variable to use explicit batching? I thought, and Jeremy and Chris both pointed out as well, that Xen could potentially use the information about which PT to unhook to take advantage of writable pagetables. But, if that is not the direction you are going, then it seems this information is not so relevant for the explicit batching. The explicit batching does have one disadvantage without writable page tables, which is a potential long term maintenance / correctness issue - you must remove read hazards from these encapsulated paths. That is not so hard to do, and not a large general problem, because the batching is explicit rather than implicit, so you can pick paths to batch that are small, compact, and easy to reason about. But nevertheless, a point I would like to make sure you are comfortable with before we all decide these hooks will work for everyone.
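To make the read hazard concrete, here is a small user-space sketch under stated assumptions. The helpers queue_pte(), apply_pending(), read_pte_raw(), and read_pte_safe() are hypothetical names, not kernel or hypervisor interfaces. While a write is still queued, a direct read of the page table sees the old value; flushing before the read is one simple way to remove the hazard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t pte_t;

#define PENDING_MAX 16          /* hypothetical queue size */

struct pending_write {
    pte_t *ptep;
    pte_t  val;
};

static struct pending_write pending[PENDING_MAX];
static size_t npending;

/* Apply every queued write (stand-in for the batched hypercall). */
static void apply_pending(void)
{
    for (size_t i = 0; i < npending; i++)
        *pending[i].ptep = pending[i].val;
    npending = 0;
}

/* Queue a write instead of applying it; batching is assumed active. */
static void queue_pte(pte_t *ptep, pte_t val)
{
    assert(ptep != NULL);
    if (npending == PENDING_MAX)
        apply_pending();
    pending[npending++] = (struct pending_write){ ptep, val };
}

/* Hazard: reads the page table directly and misses queued writes. */
static pte_t read_pte_raw(const pte_t *ptep)
{
    return *ptep;
}

/* One way to remove the hazard: flush the queue before reading. */
static pte_t read_pte_safe(const pte_t *ptep)
{
    apply_pending();
    return *ptep;
}
```

Because the batched paths are chosen explicitly, an audit for such reads only has to cover the code between the enter and leave calls.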


Zach

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
