>From: Tim Deegan
>Sent: 2008年1月22日 22:52
>
>At 21:23 +0800 on 22 Jan (1201037017), Tian, Kevin wrote:
>> Anyway, my points are:
>> Shadow lock is just the lock for shadow page tables,
>> which can't be used to prevent guest page table race. Even
>> when guest table walk is included in shadow lock, there's no
>> determined result for insane guest behavior.
>
>Agreed. I'm willing to believe that the walks can be done without the
>lock held (just didn't want to assert it was safe without finding time
>to look more closely).
Fair enough. :-)
>
>But the current semantics of the shadow lock also includes maintaining
>the invariant that a shadow entry is always a valid shadow of the guest
>entry it represents (or not present).
That's an important point I may overlook before...
>
>It's relied on for level-2 and higher entries in the pagefault handler
>-- if there is a shadow pagetable present mapping a particular
>VA range,
>it is assumed that it is the shadow of the guest pagetable. For
>example, using a 2-level system for simplicity:
>
>- Guest l2e in slot 0 is 0x00001067 (GFN 0x1 is the gl1 table
>for VA 0x0)
> Shadow l2e in slot 0 is 0x0000a067 (MFN 0xA is the sl1 table
>for VA 0x0)
>
>- Guest vcpu 1 writes 0x00002067 into l2e slot 0. This write is
> emulated, and the shadow lock is not held when the guest pagetable is
> written to in sh_x86_emulate_write(). Just before the lock
>is taken...
>
>- Guest vcpu 2 takes a pagefault to VA 0x0 (and is using the same
> pagetable as vcpu 1 -- unlikely but possible, and less crazy
>scenarios
> have the same effect on 4-level pagetables).
>
>- The shadow fault handler sees 0x00002067, and maps GFN 0x2 as the
> guest l1 table. It sees that there is already a shadow l1 table
> present; so it propagates the contents of an l1 from GFN 0x2 into MFN
> 0xA, which is the wrong shadow.
>
>- Guest vcpu 1's emulation now acquires the shadow lock and updates the
> shadow l2e to 0x0000B067, but it's too late - the shadows are already
> corrupted, and no amount of TLB flushing will make it better.
Yes, this one is not recoverable. Not like a physical processor which can
even recover as long as TLB is flushed later. It's a good lesson to me
about invariant relationship between shadow entry and guest one.
>
>Sorry if that's a bit long-winded; needed to make sure it was
>still true
>after the recent rework of the shadow PT walker.
>
So I admit that my patch is a bit incorrectly aggressive, and guest table
update must be always coupled with shadow sync logic within lock.
It's not worthy of breaking it which may introduce more overhead in
normal path.
At this time, it's natural to ask how about just releasing the lock before
the point updating guest entry. But no, I'll hold this time to think more
in case of similar ignorance in other path, though my mind tempts to
persuade me then it should be safe. :-P I just really want to release
unnecessary lock contention which is worse after scaled up...
Thanks,
Kevin
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|