WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/

xen-devel

Re: [Xen-devel] [PATCH 5 of 5] Modify naming of queries into the p2m

To: "Tim Deegan" <tim@xxxxxxx>
Subject: Re: [Xen-devel] [PATCH 5 of 5] Modify naming of queries into the p2m
From: "Andres Lagar-Cavilla" <andres@xxxxxxxxxxxxxxxx>
Date: Thu, 10 Nov 2011 08:47:55 -0800
Cc: olaf@xxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxx, george.dunlap@xxxxxxxxxxxxx, andres@xxxxxxxxxxxxxx, keir.xen@xxxxxxxxx, Andres Lagar-Cavilla <andres@xxxxxxxxxxxxxxxx>, adin@xxxxxxxxxxxxxx
Delivery-date: Thu, 10 Nov 2011 08:48:53 -0800
In-reply-to: <20111110122634.GD62117@xxxxxxxxxxxxxxxxxxxxx>
References: <patchbomb.1320722908@xxxxxxxxxxxxxxxxxxx> <e9fd07479f684f558a92.1320722913@xxxxxxxxxxxxxxxxxxx> <20111110122634.GD62117@xxxxxxxxxxxxxxxxxxxxx>
Reply-to: andres@xxxxxxxxxxxxxxxx
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: SquirrelMail/1.4.21
Tim, see below re nested-SVM code:
> Hi,
>
> This mostly looks good; I have a few comments on the detail below.
> If those are addressed I hope I can apply the next version.
>
> Any other maintainers want to object to this renaming?
>
> At 22:28 -0500 on 07 Nov (1320704913), Andres Lagar-Cavilla wrote:
>> @@ -1182,6 +1195,7 @@ int hvm_hap_nested_page_fault(unsigned l
>>      mfn_t mfn;
>>      struct vcpu *v = current;
>>      struct p2m_domain *p2m;
>> +    int rc;
>>
>>      /* On Nested Virtualization, walk the guest page table.
>>       * If this succeeds, all is fine.
>> @@ -1255,8 +1269,8 @@ int hvm_hap_nested_page_fault(unsigned l
>>          if ( violation )
>>          {
>>              p2m_mem_access_check(gpa, gla_valid, gla, access_r,
>> access_w, access_x);
>> -
>> -            return 1;
>> +            rc = 1;
>> +            goto out_put_gfn;
>
> Shouldn't this patch be changing the call to gfn_to_mfn_type_p2m() just
> above here to a get_gfn_*() call too?
>
>> -void hvm_unmap_guest_frame(void *p)
>> +void hvm_unmap_guest_frame(void *p, unsigned long addr, int is_va)
>>  {
>> +    /* We enter this function with a map obtained in
>> __hvm_map_guest_frame.
>> +     * This map performed a p2m query that locked the gfn entry and got
>> +     * a ref on the mfn. Must undo */
>>      if ( p )
>> +    {
>> +        unsigned long gfn = INVALID_GFN;
>> +
>> +        if ( is_va )
>> +        {
>> +            if ( addr )
>> +            {
>> +                uint32_t pfec = PFEC_page_present;
>> +                gfn = paging_gva_to_gfn(current, addr, &pfec);
>> +            } else {
>> +                gfn = INVALID_GFN;
>> +            }
>> +        } else {
>> +            gfn = addr;
>> +        }
>> +
>> +        if ( gfn != INVALID_GFN )
>> +            put_gfn(current->domain, gfn);
>> +
>>          unmap_domain_page(p);
>> +    }
>>  }
>
> This is a pretty weird-looking function now - I think it would be better
> just to make all callers have to remember the GFN.  In fact, since a
> guest VCPU can change its pagetables at any time, it's not safe to use
> paging_gva_to_gfn() to regenerate the GFN from a VA.
>
> Also, IIRC the nested-SVM code keeps the hvm_map mapping around for
> quite a long time so in fact this unmap-and-put-gfn may not be the right
> interface.  Maybe the caller should put_gfn() explicitly.

I just looked into nestedsvm_vmcb_map. The map it stores in nv_vvmcx
appears to survive beyond the vmexit, so if we don't put_gfn we would be
entering the scheduler in an atomic context.

The shorthand solution is to put_gfn immediately, though in principle that
could leave the map unsafe on re-entry. I don't think it does in practice,
so I'll push forward with that.

Is it possible to tear down and re-establish the maps on each hypervisor
entry/exit? That would seem like the ultimate solution.

Andres
>
>> diff -r f07d4ebe5248 -r e9fd07479f68 xen/arch/x86/hvm/svm/svm.c
>> --- a/xen/arch/x86/hvm/svm/svm.c
>> +++ b/xen/arch/x86/hvm/svm/svm.c
>> @@ -244,9 +244,10 @@ static int svm_vmcb_restore(struct vcpu
>>      {
>>          if ( c->cr0 & X86_CR0_PG )
>>          {
>> -            mfn = mfn_x(gfn_to_mfn(v->domain, c->cr3 >> PAGE_SHIFT,
>> &p2mt));
>> +            mfn = mfn_x(get_gfn(v->domain, c->cr3 >> PAGE_SHIFT,
>> &p2mt));
>>              if ( !p2m_is_ram(p2mt) || !get_page(mfn_to_page(mfn),
>> v->domain) )
>>              {
>> +                put_gfn(v->domain, c->cr3 >> PAGE_SHIFT);
>>                  gdprintk(XENLOG_ERR, "Invalid CR3 value=0x%"PRIx64"\n",
>>                           c->cr3);
>>                  return -EINVAL;
>> @@ -257,6 +258,8 @@ static int svm_vmcb_restore(struct vcpu
>>              put_page(pagetable_get_page(v->arch.guest_table));
>>
>>          v->arch.guest_table = pagetable_from_pfn(mfn);
>> +        if ( c->cr0 & X86_CR0_PG )
>> +            put_gfn(v->domain, c->cr3 >> PAGE_SHIFT);
>>      }
>>
>>      v->arch.hvm_vcpu.guest_cr[0] = c->cr0 | X86_CR0_ET;
>> @@ -1144,7 +1147,6 @@ static void svm_do_nested_pgfault(struct
>>      unsigned long gfn = gpa >> PAGE_SHIFT;
>>      mfn_t mfn;
>>      p2m_type_t p2mt;
>> -    p2m_access_t p2ma;
>>      struct p2m_domain *p2m = NULL;
>>
>>      ret = hvm_hap_nested_page_fault(gpa, 0, ~0ul, 0, 0, 0, 0);
>> @@ -1161,7 +1163,8 @@ static void svm_do_nested_pgfault(struct
>>          p2m = p2m_get_p2m(v);
>>          _d.gpa = gpa;
>>          _d.qualification = 0;
>> -        _d.mfn = mfn_x(gfn_to_mfn_type_p2m(p2m, gfn, &_d.p2mt, &p2ma,
>> p2m_query, NULL));
>> +        mfn = get_gfn_query_unlocked(p2m->domain, gfn, &_d.p2mt);
>> +        _d.mfn = mfn_x(mfn);
>
> Ah - this is not quite the same thing.  This lookup uses a specific p2m
> table (possibly a nested-p2m table reflecting the fact the guest is
> running in nested SVM mode) so it can't be replaced by a call that just
> takes the domain pointer (and will internally use the domain's master
> p2m table).
>
> The naming of 'p2m_get_p2m()' is particularly unhelpful here, I realise.
> It fetches the running p2m, i.e. the one that hardware sees as an NPT
> table; almost everywhere else uses p2m_get_hostp2m(), which
> fetches the master p2m.  But in general when you see the
> gfn_to_mfn_*_p2m() functions, that take an explicit p2m pointer, you
> need to be careful.
>
>>
>>          __trace_var(TRC_HVM_NPF, 0, sizeof(_d), &_d);
>>      }
>> @@ -1181,7 +1184,7 @@ static void svm_do_nested_pgfault(struct
>>      if ( p2m == NULL )
>>          p2m = p2m_get_p2m(v);
>>      /* Everything else is an error. */
>> -    mfn = gfn_to_mfn_type_p2m(p2m, gfn, &p2mt, &p2ma, p2m_guest, NULL);
>> +    mfn = get_gfn_guest_unlocked(p2m->domain, gfn, &p2mt);
>
> Likewise here.
>
>> diff -r f07d4ebe5248 -r e9fd07479f68 xen/arch/x86/mm/guest_walk.c
>> --- a/xen/arch/x86/mm/guest_walk.c
>> +++ b/xen/arch/x86/mm/guest_walk.c
>> @@ -86,30 +86,33 @@ static uint32_t set_ad_bits(void *guest_
>>      return 0;
>>  }
>>
>> +/* If the map is non-NULL, we leave this function having
>> + * called get_gfn, you need to call put_gfn. */
>>  static inline void *map_domain_gfn(struct p2m_domain *p2m,
>>                                     gfn_t gfn,
>>                                     mfn_t *mfn,
>>                                     p2m_type_t *p2mt,
>>                                     uint32_t *rc)
>>  {
>> -    p2m_access_t a;
>> -
>>      /* Translate the gfn, unsharing if shared */
>> -    *mfn = gfn_to_mfn_type_p2m(p2m, gfn_x(gfn), p2mt, &a, p2m_unshare,
>> NULL);
>> +    *mfn = get_gfn_unshare(p2m->domain, gfn_x(gfn), p2mt);
>
> Here's another case where we need to handle the nested-hap path; the p2m
> pointer here must be propagated into the lookup function.
>
>>      if ( p2m_is_paging(*p2mt) )
>>      {
>>          ASSERT(!p2m_is_nestedp2m(p2m));
>>          p2m_mem_paging_populate(p2m->domain, gfn_x(gfn));
>> +        put_gfn(p2m->domain, gfn_x(gfn));
>>          *rc = _PAGE_PAGED;
>>          return NULL;
>>      }
>>      if ( p2m_is_shared(*p2mt) )
>>      {
>> +        put_gfn(p2m->domain, gfn_x(gfn));
>>          *rc = _PAGE_SHARED;
>>          return NULL;
>>      }
>>      if ( !p2m_is_ram(*p2mt) )
>>      {
>> +        put_gfn(p2m->domain, gfn_x(gfn));
>>          *rc |= _PAGE_PRESENT;
>>          return NULL;
>>      }
>> @@ -120,6 +123,9 @@ static inline void *map_domain_gfn(struc
>>
>>
>>  /* Walk the guest pagetables, after the manner of a hardware walker. */
>> +/* Because the walk is essentially random, it can cause a deadlock
>> + * warning in the p2m locking code. Highly unlikely this is an actual
>> + * deadlock, because who would walk page table in the opposite order?
>> */
>
> Linear pagetables make this sort of thing more likely, especially if
> they're used by one process to update another process's mappings.
>
> If this is a problem, maybe we'll need to avoid the deadlock either by
> allowing multiple lock-holders in the p2m or by rearranging the a/d
> writeback so it does another p2m lookup -- I'd rather not do that,
> though, since even on 64-bit it will add a lot of memory accesses.
>
> I'm happy to take a version of this big switchover patch that doesn't
> solve this problem, though.  It can be sorted out in a separate patch.
>
>> diff -r f07d4ebe5248 -r e9fd07479f68 xen/arch/x86/mm/hap/nested_hap.c
>> --- a/xen/arch/x86/mm/hap/nested_hap.c
>> +++ b/xen/arch/x86/mm/hap/nested_hap.c
>> @@ -146,22 +146,29 @@ nestedhap_walk_L0_p2m(struct p2m_domain
>>      mfn_t mfn;
>>      p2m_type_t p2mt;
>>      p2m_access_t p2ma;
>> +    int rc;
>>
>>      /* walk L0 P2M table */
>>      mfn = gfn_to_mfn_type_p2m(p2m, L1_gpa >> PAGE_SHIFT, &p2mt, &p2ma,
>>                                p2m_query, page_order);
>
> Again, are we not changing this function's name?
>
>
> Cheers,
>
> Tim.
>



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel