Re: [Xen-devel] [PATCH 5 of 5] Modify naming of queries into the p2m

To: "Tim Deegan" <tim@xxxxxxx>
Subject: Re: [Xen-devel] [PATCH 5 of 5] Modify naming of queries into the p2m
From: "Andres Lagar-Cavilla" <andres@xxxxxxxxxxxxxxxx>
Date: Thu, 10 Nov 2011 05:54:17 -0800
Cc: olaf@xxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxx, george.dunlap@xxxxxxxxxxxxx, andres@xxxxxxxxxxxxxx, keir.xen@xxxxxxxxx, adin@xxxxxxxxxxxxxx
Delivery-date: Thu, 10 Nov 2011 05:54:49 -0800
In-reply-to: <20111110122634.GD62117@xxxxxxxxxxxxxxxxxxxxx>
References: <patchbomb.1320722908@xxxxxxxxxxxxxxxxxxx> <e9fd07479f684f558a92.1320722913@xxxxxxxxxxxxxxxxxxx> <20111110122634.GD62117@xxxxxxxxxxxxxxxxxxxxx>
Reply-to: andres@xxxxxxxxxxxxxxxx
User-agent: SquirrelMail/1.4.21
Ok, thanks. I'll try to address those comments asap in a new version.

Off the bat:
- sharing deadlock: I can arrange it so that shr_lock is always taken
after get_gfn. That would still require either disabling deadlock
detection or moving shr_lock down the deadlock order. Which one is
preferable?

The real solution that I have queued is removing the global hash table,
and locking each shared page individually using PGT_locked.
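
To make the ordering concrete, here is roughly what I mean -- just a
sketch using the names from this thread, not the final code:

    /* Sketch: the sharing lock ordered strictly after the gfn lock. */
    p2m_type_t p2mt;
    mfn_t mfn = get_gfn(d, gfn, &p2mt); /* queries the p2m, locks the gfn entry */
    shr_lock();                         /* sharing lock taken second, never first */
    /* ... update the shared-page hash table ... */
    shr_unlock();
    put_gfn(d, gfn);                    /* drop the gfn lock last */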

- get_mfn: the goal is additional belts-and-braces checking. It doesn't
require an extra argument: put_gfn will put the mfn that sits in the p2m
entry at the time of the put. A few tricks are needed to handle
remove_page, sharing, etc. Works for me with Windows and Linux HVM
guests (so far).
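
The pairing I'm aiming for looks something like this (a sketch,
assuming the get_gfn/put_gfn names proposed in this series):

    p2m_type_t p2mt;
    mfn_t mfn = get_gfn(d, gfn, &p2mt);  /* query + lock + ref on the mfn */
    if ( mfn_valid(mfn) )
    {
        /* use the frame; the p2m entry can't change under our feet */
    }
    put_gfn(d, gfn);  /* puts whatever mfn is in the entry right now */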

- I can rename gfn_to_mfn_type_p2m to get_gfn_type_access (since that's
what the remaining uses are for).
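
That is, same argument list, new name -- something like this (sketch):

    mfn_t get_gfn_type_access(struct p2m_domain *p2m, unsigned long gfn,
                              p2m_type_t *t, p2m_access_t *a,
                              p2m_query_t q, unsigned int *page_order);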

Thanks for your feedback on the nested p2m functions, which I don't
quite get yet.

Andres
> Hi,
>
> This mostly looks good; I have a few comments on the detail below.
> If those are addressed I hope I can apply the next version.
>
> Any other maintainers want to object to this renaming?
>
> At 22:28 -0500 on 07 Nov (1320704913), Andres Lagar-Cavilla wrote:
>> @@ -1182,6 +1195,7 @@ int hvm_hap_nested_page_fault(unsigned l
>>      mfn_t mfn;
>>      struct vcpu *v = current;
>>      struct p2m_domain *p2m;
>> +    int rc;
>>
>>      /* On Nested Virtualization, walk the guest page table.
>>       * If this succeeds, all is fine.
>> @@ -1255,8 +1269,8 @@ int hvm_hap_nested_page_fault(unsigned l
>>          if ( violation )
>>          {
>>              p2m_mem_access_check(gpa, gla_valid, gla, access_r, access_w, access_x);
>> -
>> -            return 1;
>> +            rc = 1;
>> +            goto out_put_gfn;
>
> Shouldn't this patch be changing the call to gfn_to_mfn_type_p2m() just
> above here to a get_gfn_*() call too?
>
>> -void hvm_unmap_guest_frame(void *p)
>> +void hvm_unmap_guest_frame(void *p, unsigned long addr, int is_va)
>>  {
>> +    /* We enter this function with a map obtained in __hvm_map_guest_frame.
>> +     * This map performed a p2m query that locked the gfn entry and got
>> +     * a ref on the mfn. Must undo */
>>      if ( p )
>> +    {
>> +        unsigned long gfn = INVALID_GFN;
>> +
>> +        if ( is_va )
>> +        {
>> +            if ( addr )
>> +            {
>> +                uint32_t pfec = PFEC_page_present;
>> +                gfn = paging_gva_to_gfn(current, addr, &pfec);
>> +            } else {
>> +                gfn = INVALID_GFN;
>> +            }
>> +        } else {
>> +            gfn = addr;
>> +        }
>> +
>> +        if ( gfn != INVALID_GFN )
>> +            put_gfn(current->domain, gfn);
>> +
>>          unmap_domain_page(p);
>> +    }
>>  }
>
> This is a pretty weird-looking function now - I think it would be better
> just to make all callers have to remember the GFN.  In fact, since a
> guest VCPU can change its pagetables at any time, it's not safe to use
> paging_gva_to_gfn() to regenerate the GFN from a VA.
>
> Also, IIRC the nested-SVM code keeps the hvm_map mapping around for
> quite a long time so in fact this unmap-and-put-gfn may not be the right
> interface.  Maybe the caller should put_gfn() explicitly.
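>
> Something like this at the call sites, perhaps (just a sketch, with
> hvm_map_guest_frame_ro() standing in for the mapping function):
>
>     void *p = hvm_map_guest_frame_ro(gfn);   /* takes the ref on gfn */
>     /* ... use the mapping for as long as needed ... */
>     hvm_unmap_guest_frame(p);                /* unmap only, no p2m work */
>     put_gfn(current->domain, gfn);           /* caller balances the get */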
>
>> diff -r f07d4ebe5248 -r e9fd07479f68 xen/arch/x86/hvm/svm/svm.c
>> --- a/xen/arch/x86/hvm/svm/svm.c
>> +++ b/xen/arch/x86/hvm/svm/svm.c
>> @@ -244,9 +244,10 @@ static int svm_vmcb_restore(struct vcpu
>>      {
>>          if ( c->cr0 & X86_CR0_PG )
>>          {
>> -            mfn = mfn_x(gfn_to_mfn(v->domain, c->cr3 >> PAGE_SHIFT, &p2mt));
>> +            mfn = mfn_x(get_gfn(v->domain, c->cr3 >> PAGE_SHIFT, &p2mt));
>>              if ( !p2m_is_ram(p2mt) || !get_page(mfn_to_page(mfn), v->domain) )
>>              {
>> +                put_gfn(v->domain, c->cr3 >> PAGE_SHIFT);
>>                  gdprintk(XENLOG_ERR, "Invalid CR3 value=0x%"PRIx64"\n",
>>                           c->cr3);
>>                  return -EINVAL;
>> @@ -257,6 +258,8 @@ static int svm_vmcb_restore(struct vcpu
>>              put_page(pagetable_get_page(v->arch.guest_table));
>>
>>          v->arch.guest_table = pagetable_from_pfn(mfn);
>> +        if ( c->cr0 & X86_CR0_PG )
>> +            put_gfn(v->domain, c->cr3 >> PAGE_SHIFT);
>>      }
>>
>>      v->arch.hvm_vcpu.guest_cr[0] = c->cr0 | X86_CR0_ET;
>> @@ -1144,7 +1147,6 @@ static void svm_do_nested_pgfault(struct
>>      unsigned long gfn = gpa >> PAGE_SHIFT;
>>      mfn_t mfn;
>>      p2m_type_t p2mt;
>> -    p2m_access_t p2ma;
>>      struct p2m_domain *p2m = NULL;
>>
>>      ret = hvm_hap_nested_page_fault(gpa, 0, ~0ul, 0, 0, 0, 0);
>> @@ -1161,7 +1163,8 @@ static void svm_do_nested_pgfault(struct
>>          p2m = p2m_get_p2m(v);
>>          _d.gpa = gpa;
>>          _d.qualification = 0;
>> -        _d.mfn = mfn_x(gfn_to_mfn_type_p2m(p2m, gfn, &_d.p2mt, &p2ma, p2m_query, NULL));
>> +        mfn = get_gfn_query_unlocked(p2m->domain, gfn, &_d.p2mt);
>> +        _d.mfn = mfn_x(mfn);
>
> Ah - this is not quite the same thing.  This lookup uses a specific p2m
> table (possibly a nested-p2m table reflecting the fact the guest is
> running in nested SVM mode) so it can't be replaced by a call that just
> takes the domain pointer (and will internally use the domain's master
> p2m table).
>
> The naming of 'p2m_get_p2m()' is particularly unhelpful here, I realise.
> It fetches the running p2m, i.e. the one that hardware sees as an NPT
> table; almost everywhere else uses p2m_get_hostp2m(), which
> fetches the master p2m.  But in general when you see the
> gfn_to_mfn_*_p2m() functions, that take an explicit p2m pointer, you
> need to be careful.
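>
> Concretely, these sites probably want to keep a variant that takes the
> explicit p2m pointer -- something like this (sketch, reusing the
> get_gfn_type_access name you proposed):
>
>     struct p2m_domain *p2m = p2m_get_p2m(v);  /* the *running*, possibly nested, p2m */
>     mfn = get_gfn_type_access(p2m, gfn, &_d.p2mt, &p2ma, p2m_query, NULL);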
>
>>
>>          __trace_var(TRC_HVM_NPF, 0, sizeof(_d), &_d);
>>      }
>> @@ -1181,7 +1184,7 @@ static void svm_do_nested_pgfault(struct
>>      if ( p2m == NULL )
>>          p2m = p2m_get_p2m(v);
>>      /* Everything else is an error. */
>> -    mfn = gfn_to_mfn_type_p2m(p2m, gfn, &p2mt, &p2ma, p2m_guest, NULL);
>> +    mfn = get_gfn_guest_unlocked(p2m->domain, gfn, &p2mt);
>
> Likewise here.
>
>> diff -r f07d4ebe5248 -r e9fd07479f68 xen/arch/x86/mm/guest_walk.c
>> --- a/xen/arch/x86/mm/guest_walk.c
>> +++ b/xen/arch/x86/mm/guest_walk.c
>> @@ -86,30 +86,33 @@ static uint32_t set_ad_bits(void *guest_
>>      return 0;
>>  }
>>
>> +/* If the map is non-NULL, we leave this function having
>> + * called get_gfn, you need to call put_gfn. */
>>  static inline void *map_domain_gfn(struct p2m_domain *p2m,
>>                                     gfn_t gfn,
>>                                     mfn_t *mfn,
>>                                     p2m_type_t *p2mt,
>>                                     uint32_t *rc)
>>  {
>> -    p2m_access_t a;
>> -
>>      /* Translate the gfn, unsharing if shared */
>> -    *mfn = gfn_to_mfn_type_p2m(p2m, gfn_x(gfn), p2mt, &a, p2m_unshare, NULL);
>> +    *mfn = get_gfn_unshare(p2m->domain, gfn_x(gfn), p2mt);
>
> Here's another case where we need to handle the nested-hap path; the p2m
> pointer here must be propagated into the lookup function.
>
>>      if ( p2m_is_paging(*p2mt) )
>>      {
>>          ASSERT(!p2m_is_nestedp2m(p2m));
>>          p2m_mem_paging_populate(p2m->domain, gfn_x(gfn));
>> +        put_gfn(p2m->domain, gfn_x(gfn));
>>          *rc = _PAGE_PAGED;
>>          return NULL;
>>      }
>>      if ( p2m_is_shared(*p2mt) )
>>      {
>> +        put_gfn(p2m->domain, gfn_x(gfn));
>>          *rc = _PAGE_SHARED;
>>          return NULL;
>>      }
>>      if ( !p2m_is_ram(*p2mt) )
>>      {
>> +        put_gfn(p2m->domain, gfn_x(gfn));
>>          *rc |= _PAGE_PRESENT;
>>          return NULL;
>>      }
>> @@ -120,6 +123,9 @@ static inline void *map_domain_gfn(struc
>>
>>
>>  /* Walk the guest pagetables, after the manner of a hardware walker. */
>> +/* Because the walk is essentially random, it can cause a deadlock
>> + * warning in the p2m locking code. Highly unlikely this is an actual
>> + * deadlock, because who would walk page tables in the opposite order? */
>
> Linear pagetables make this sort of thing more likely, especially if
> they're used by one process to update another process's mappings.
>
> If this is a problem, maybe we'll need to avoid the deadlock either by
> allowing multiple lock-holders in the p2m or by rearranging the a/d
> writeback so it does another p2m lookup -- I'd rather not do that,
> though, since even on 64-bit it will add a lot of memory accesses.
>
> I'm happy to take a version of this big switchover patch that doesn't
> solve this problem, though.  It can be sorted out in a separate patch.
>
>> diff -r f07d4ebe5248 -r e9fd07479f68 xen/arch/x86/mm/hap/nested_hap.c
>> --- a/xen/arch/x86/mm/hap/nested_hap.c
>> +++ b/xen/arch/x86/mm/hap/nested_hap.c
>> @@ -146,22 +146,29 @@ nestedhap_walk_L0_p2m(struct p2m_domain
>>      mfn_t mfn;
>>      p2m_type_t p2mt;
>>      p2m_access_t p2ma;
>> +    int rc;
>>
>>      /* walk L0 P2M table */
>>      mfn = gfn_to_mfn_type_p2m(p2m, L1_gpa >> PAGE_SHIFT, &p2mt, &p2ma,
>>                                p2m_query, page_order);
>
> Again, are we not changing this function's name?
>
>
> Cheers,
>
> Tim.
>


