Ok, thanks. I'll try to address those comments asap in a new version.
Off the bat:
- sharing deadlock: I can arrange it so that shr_lock is always taken
after get_gfn. That would still require either disabling deadlock
detection or moving shr_lock down the deadlock order. Which one is
preferable?
The real solution that I have queued is removing the global hash table,
and locking each shared page individually using PGT_locked.
- get_mfn: the goal is additional belt-and-braces. It doesn't require an
extra argument: put_gfn will put the mfn that sits in the p2m entry at the
time of the put (roughly as sketched right after this list). A few tricks
are needed to handle remove_page, sharing, etc. Works for me with Windows
and Linux HVM guests (so far).
- I can rename gfn_to_mfn_type_p2m to get_gfn_type_access (since that's
what the remaining uses are for)
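
To make the get_gfn/put_gfn pairing concrete, here is a minimal sketch of
what I have in mind for a typical caller. The comments describing what
put_gfn does internally reflect my intended behaviour rather than anything
the posted patch spells out, and example_lookup() is just an illustrative
name:

    static void example_lookup(struct domain *d, unsigned long gfn)
    {
        p2m_type_t p2mt;
        mfn_t mfn = get_gfn(d, gfn, &p2mt); /* locks the gfn, refs the mfn */

        if ( p2m_is_ram(p2mt) )
        {
            /* ... use the mfn; the p2m entry can't change under our feet ... */
        }

        /* put_gfn() drops the ref on whatever mfn sits in the p2m entry at
         * the time of the put (handling remove_page, sharing, etc.), then
         * unlocks the gfn. */
        put_gfn(d, gfn);
    }
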
Thanks for your feedback on the nested p2m functions, which I don't quite
get yet.
Andres
> Hi,
>
> This mostly looks good; I have a few comments on the detail below.
> If those are addressed I hope I can apply the next version.
>
> Any other maintainers want to object to this renaming?
>
> At 22:28 -0500 on 07 Nov (1320704913), Andres Lagar-Cavilla wrote:
>> @@ -1182,6 +1195,7 @@ int hvm_hap_nested_page_fault(unsigned l
>> mfn_t mfn;
>> struct vcpu *v = current;
>> struct p2m_domain *p2m;
>> + int rc;
>>
>> /* On Nested Virtualization, walk the guest page table.
>> * If this succeeds, all is fine.
>> @@ -1255,8 +1269,8 @@ int hvm_hap_nested_page_fault(unsigned l
>> if ( violation )
>> {
>> p2m_mem_access_check(gpa, gla_valid, gla, access_r, access_w, access_x);
>> -
>> - return 1;
>> + rc = 1;
>> + goto out_put_gfn;
>
> Shouldn't this patch be changing the call to gfn_to_mfn_type_p2m() just
> above here to a get_gfn_*() call too?
>
>> -void hvm_unmap_guest_frame(void *p)
>> +void hvm_unmap_guest_frame(void *p, unsigned long addr, int is_va)
>> {
>> + /* We enter this function with a map obtained in __hvm_map_guest_frame.
>> + * This map performed a p2m query that locked the gfn entry and got
>> + * a ref on the mfn. Must undo */
>> if ( p )
>> + {
>> + unsigned long gfn = INVALID_GFN;
>> +
>> + if ( is_va )
>> + {
>> + if ( addr )
>> + {
>> + uint32_t pfec = PFEC_page_present;
>> + gfn = paging_gva_to_gfn(current, addr, &pfec);
>> + } else {
>> + gfn = INVALID_GFN;
>> + }
>> + } else {
>> + gfn = addr;
>> + }
>> +
>> + if ( gfn != INVALID_GFN )
>> + put_gfn(current->domain, gfn);
>> +
>> unmap_domain_page(p);
>> + }
>> }
>
> This is a pretty weird-looking function now - I think it would be better
> just to make all callers have to remember the GFN. In fact, since a
> guest VCPU can change its pagetables at any time, it's not safe to use
> paging_gva_to_gfn() to regenerate the GFN from a VA.
>
> Also, IIRC the nested-SVM code keeps the hvm_map mapping around for
> quite a long time so in fact this unmap-and-put-gfn may not be the right
> interface. Maybe the caller should put_gfn() explicitly.
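
Agreed that regenerating the gfn from the VA is fragile. If I go with the
explicit put, the caller-side pattern would be roughly the below (a sketch
only: hvm_map_guest_frame_rw() is the existing mapping helper, the
one-argument unmap plus caller-side put_gfn() is the proposed shape, and
where exactly nested SVM drops its long-lived ref still needs thought):

    void *p = hvm_map_guest_frame_rw(gfn);    /* takes the gfn ref inside */
    if ( p )
    {
        /* ... use the mapping, possibly for a long time (nested SVM) ... */
        hvm_unmap_guest_frame(p);             /* back to the one-arg form */
        put_gfn(current->domain, gfn);        /* caller drops the gfn ref */
    }
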
>
>> diff -r f07d4ebe5248 -r e9fd07479f68 xen/arch/x86/hvm/svm/svm.c
>> --- a/xen/arch/x86/hvm/svm/svm.c
>> +++ b/xen/arch/x86/hvm/svm/svm.c
>> @@ -244,9 +244,10 @@ static int svm_vmcb_restore(struct vcpu
>> {
>> if ( c->cr0 & X86_CR0_PG )
>> {
>> - mfn = mfn_x(gfn_to_mfn(v->domain, c->cr3 >> PAGE_SHIFT, &p2mt));
>> + mfn = mfn_x(get_gfn(v->domain, c->cr3 >> PAGE_SHIFT, &p2mt));
>> if ( !p2m_is_ram(p2mt) || !get_page(mfn_to_page(mfn), v->domain) )
>> {
>> + put_gfn(v->domain, c->cr3 >> PAGE_SHIFT);
>> gdprintk(XENLOG_ERR, "Invalid CR3 value=0x%"PRIx64"\n", c->cr3);
>> return -EINVAL;
>> @@ -257,6 +258,8 @@ static int svm_vmcb_restore(struct vcpu
>> put_page(pagetable_get_page(v->arch.guest_table));
>>
>> v->arch.guest_table = pagetable_from_pfn(mfn);
>> + if ( c->cr0 & X86_CR0_PG )
>> + put_gfn(v->domain, c->cr3 >> PAGE_SHIFT);
>> }
>>
>> v->arch.hvm_vcpu.guest_cr[0] = c->cr0 | X86_CR0_ET;
>> @@ -1144,7 +1147,6 @@ static void svm_do_nested_pgfault(struct
>> unsigned long gfn = gpa >> PAGE_SHIFT;
>> mfn_t mfn;
>> p2m_type_t p2mt;
>> - p2m_access_t p2ma;
>> struct p2m_domain *p2m = NULL;
>>
>> ret = hvm_hap_nested_page_fault(gpa, 0, ~0ul, 0, 0, 0, 0);
>> @@ -1161,7 +1163,8 @@ static void svm_do_nested_pgfault(struct
>> p2m = p2m_get_p2m(v);
>> _d.gpa = gpa;
>> _d.qualification = 0;
>> - _d.mfn = mfn_x(gfn_to_mfn_type_p2m(p2m, gfn, &_d.p2mt, &p2ma, p2m_query, NULL));
>> + mfn = get_gfn_query_unlocked(p2m->domain, gfn, &_d.p2mt);
>> + _d.mfn = mfn_x(mfn);
>
> Ah - this is not quite the same thing. This lookup uses a specific p2m
> table (possibly a nested-p2m table reflecting the fact the guest is
> running in nested SVM mode) so it can't be replaced by a call that just
> takes the domain pointer (and will internally use the domain's master
> p2m table).
>
> The naming of 'p2m_get_p2m()' is particularly unhelpful here, I realise.
> It fetches the running p2m, i.e. the one that hardware sees as an NPT
> table; almost everywhere else uses p2m_get_hostp2m(), which
> fetches the master p2m. But in general when you see the
> gfn_to_mfn_*_p2m() functions, that take an explicit p2m pointer, you
> need to be careful.
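
Understood - for the nested paths I'll keep an accessor that takes the
explicit p2m pointer. Under the rename I floated above it would read
roughly as below (get_gfn_type_access() is only my proposed name, so treat
the spelling as tentative; the arguments are unchanged from
gfn_to_mfn_type_p2m()):

    p2m_access_t p2ma;
    /* Explicit-p2m lookup, so a nested p2m is honoured. */
    mfn = get_gfn_type_access(p2m, gfn, &_d.p2mt, &p2ma, p2m_query, NULL);
    _d.mfn = mfn_x(mfn);
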
>
>>
>> __trace_var(TRC_HVM_NPF, 0, sizeof(_d), &_d);
>> }
>> @@ -1181,7 +1184,7 @@ static void svm_do_nested_pgfault(struct
>> if ( p2m == NULL )
>> p2m = p2m_get_p2m(v);
>> /* Everything else is an error. */
>> - mfn = gfn_to_mfn_type_p2m(p2m, gfn, &p2mt, &p2ma, p2m_guest, NULL);
>> + mfn = get_gfn_guest_unlocked(p2m->domain, gfn, &p2mt);
>
> Likewise here.
>
>> diff -r f07d4ebe5248 -r e9fd07479f68 xen/arch/x86/mm/guest_walk.c
>> --- a/xen/arch/x86/mm/guest_walk.c
>> +++ b/xen/arch/x86/mm/guest_walk.c
>> @@ -86,30 +86,33 @@ static uint32_t set_ad_bits(void *guest_
>> return 0;
>> }
>>
>> +/* If the map is non-NULL, we leave this function having
>> + * called get_gfn, you need to call put_gfn. */
>> static inline void *map_domain_gfn(struct p2m_domain *p2m,
>> gfn_t gfn,
>> mfn_t *mfn,
>> p2m_type_t *p2mt,
>> uint32_t *rc)
>> {
>> - p2m_access_t a;
>> -
>> /* Translate the gfn, unsharing if shared */
>> - *mfn = gfn_to_mfn_type_p2m(p2m, gfn_x(gfn), p2mt, &a, p2m_unshare, NULL);
>> + *mfn = get_gfn_unshare(p2m->domain, gfn_x(gfn), p2mt);
>
> Here's another case where we need to handle the nested-hap path; the p2m
> pointer here must be propagated into the lookup function.
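
Yes - map_domain_gfn() should keep taking and using the p2m pointer; with
the tentative rename above, the unsharing lookup would be roughly:

    p2m_access_t a;
    /* Propagate the caller's p2m (possibly a nested one) into the lookup,
     * unsharing if the gfn is currently shared. */
    *mfn = get_gfn_type_access(p2m, gfn_x(gfn), p2mt, &a, p2m_unshare, NULL);
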
>
>> if ( p2m_is_paging(*p2mt) )
>> {
>> ASSERT(!p2m_is_nestedp2m(p2m));
>> p2m_mem_paging_populate(p2m->domain, gfn_x(gfn));
>> + put_gfn(p2m->domain, gfn_x(gfn));
>> *rc = _PAGE_PAGED;
>> return NULL;
>> }
>> if ( p2m_is_shared(*p2mt) )
>> {
>> + put_gfn(p2m->domain, gfn_x(gfn));
>> *rc = _PAGE_SHARED;
>> return NULL;
>> }
>> if ( !p2m_is_ram(*p2mt) )
>> {
>> + put_gfn(p2m->domain, gfn_x(gfn));
>> *rc |= _PAGE_PRESENT;
>> return NULL;
>> }
>> @@ -120,6 +123,9 @@ static inline void *map_domain_gfn(struc
>>
>>
>> /* Walk the guest pagetables, after the manner of a hardware walker. */
>> +/* Because the walk is essentially random, it can cause a deadlock
>> + * warning in the p2m locking code. Highly unlikely this is an actual
>> + * deadlock, because who would walk page table in the opposite order? */
>
> Linear pagetables make this sort of thing more likely, especially if
> they're used by one process to update another process's mappings.
>
> If this is a problem, maybe we'll need to avoid the deadlock either by
> allowing multiple lock-holders in the p2m or by rearranging the a/d
> writeback so it does another p2m lookup -- I'd rather not do that,
> though, since even on 64-bit it will add a lot of memory accesses.
>
> I'm happy to take a version of this big switchover patch that doesn't
> solve this problem, though. It can be sorted out in a separate patch.
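
For the record, one way I read the "multiple lock-holders" option is a
re-entrant per-p2m lock, so that the same CPU re-taking it mid-walk doesn't
trip the deadlock detector. Purely a sketch of the idea - none of these
names exist in the tree:

    struct p2m_relock {
        spinlock_t lock;
        int        owner;     /* physical CPU id, -1 if unowned */
        unsigned   depth;
    };

    static void p2m_relock_acquire(struct p2m_relock *l)
    {
        if ( l->owner == smp_processor_id() )
        {
            l->depth++;       /* already ours: just nest */
            return;
        }
        spin_lock(&l->lock);
        l->owner = smp_processor_id();
        l->depth = 1;
    }

    static void p2m_relock_release(struct p2m_relock *l)
    {
        if ( --l->depth == 0 )
        {
            l->owner = -1;
            spin_unlock(&l->lock);
        }
    }
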
>
>> diff -r f07d4ebe5248 -r e9fd07479f68 xen/arch/x86/mm/hap/nested_hap.c
>> --- a/xen/arch/x86/mm/hap/nested_hap.c
>> +++ b/xen/arch/x86/mm/hap/nested_hap.c
>> @@ -146,22 +146,29 @@ nestedhap_walk_L0_p2m(struct p2m_domain
>> mfn_t mfn;
>> p2m_type_t p2mt;
>> p2m_access_t p2ma;
>> + int rc;
>>
>> /* walk L0 P2M table */
>> mfn = gfn_to_mfn_type_p2m(p2m, L1_gpa >> PAGE_SHIFT, &p2mt, &p2ma, p2m_query, page_order);
>
> Again, are we not changing this function's name?
>
>
> Cheers,
>
> Tim.
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel