Hi,
Thanks for the patches. As Keir points out, we'll need them in a form
that applies cleanly (maybe send as attachments next time). It's also
useful when reviewing patches if they're in 'diff -p' format. In
mercurial, you can arrabnge that by adding these lines to your ~/.hgrc
file:
[diff]
showfunc = True
Further comments inline below. Nothing too hard to address, I think.
At 22:07 +0000 on 04 Jan (1294178833), Joe Epstein wrote:
> diff -r a3cec4b94150 -r 06b0916eb91d xen/include/public/domctl.h
> --- a/xen/include/public/domctl.h Tue Jan 04 10:30:15 2011 -0800
> +++ b/xen/include/public/domctl.h Tue Jan 04 11:59:48 2011 -0800
> @@ -714,7 +714,7 @@
> /*
> * Page memory in and out.
> */
> -#define XEN_DOMCTL_MEM_EVENT_OP_PAGING (1 << 0)
> +#define XEN_DOMCTL_MEM_EVENT_OP_PAGING 1
>
> /* Domain memory paging */
> #define XEN_DOMCTL_MEM_EVENT_OP_PAGING_NOMINATE 0
> @@ -722,6 +722,12 @@
> #define XEN_DOMCTL_MEM_EVENT_OP_PAGING_PREP 2
> #define XEN_DOMCTL_MEM_EVENT_OP_PAGING_RESUME 3
>
> +/*
> + * Access permissions
> + */
> +#define XEN_DOMCTL_MEM_EVENT_OP_ACCESS 2
> +#define XEN_DOMCTL_MEM_EVENT_OP_ACCESS_RESUME 0
These could do with a nice big block comment that describes what they
do.
[...]
> diff -r a3cec4b94150 -r 06b0916eb91d xen/include/asm-x86/mem_event.h
> --- a/xen/include/asm-x86/mem_event.h Tue Jan 04 10:30:15 2011 -0800
> +++ b/xen/include/asm-x86/mem_event.h Tue Jan 04 11:59:48 2011 -0800
> @@ -24,6 +24,8 @@
> #ifndef __MEM_EVENT_H__
> #define __MEM_EVENT_H__
>
> +/* Returns true if a listener exists, else pauses VCPU */
> +int mem_event_check_listener(struct domain *d);
That's a pretty odd thing for a function to do. I see below that the
only caller already knows there's no listener and ignores the return
value. I think you can just get rid of this function entirely.
[...]
> +
> + /*
> + * If this GFN is emulated MMIO or marked as read-only (which old
> MMIO is),
> + * pass the fault to the mmio handler first.
> + */
Why did you change this comment? Pages marked as p2m_ram_ro are not
"old MMIO", they're ROM images &c, and go through the emulator so the
write can be correctly discarded.
> + if ( (p2mt == p2m_mmio_dm) || (p2mt == p2m_ram_ro) )
> {
> - /*
> - * Page log dirty is always done with order 0. If this mfn resides in
> - * a large page, we do not change other pages type within that large
> - * page.
> - */
> - paging_mark_dirty(v->domain, mfn_x(mfn));
> - p2m_change_type(p2m, gfn, p2m_ram_logdirty, p2m_ram_rw);
> + if ( !handle_mmio() )
> + {
> + guest_fault = 1;
> + goto check_access_handler;
> + }
> return 1;
> }
>
> - /* Shouldn't happen: Maybe the guest was writing to a r/o grant mapping?
> */
> - if ( p2mt == p2m_grant_map_ro )
> + /* Was it a write access: log-dirty, etc... */
> + if ( access_w ) {
> + /* PoD and log-dirty also take this path. */
Moving the log-dirty check inside the test for access_w reintroduces a
race condition in multi-vcpu systems (where the same log-dirty fault
happens on two CPUs and the first CPU has changed the type back to
p2m_ram_rw by the time the second one looks up the type). Please put
this case back unconditionally.
> + if ( p2mt != p2m_ram_rw && p2m_is_ram(p2mt) )
> + {
> + /*
> + * Page log dirty is always done with order 0. If this
> mfn resides in
> + * a large page, we do not change other pages type within
> that large
> + * page.
> + */
> + paging_mark_dirty(v->domain, mfn_x(mfn));
> + p2m_change_type(p2m, gfn, p2m_ram_logdirty, p2m_ram_rw);
> + return 1;
> + }
> +
> + /* Shouldn't happen: Maybe the guest was writing to a r/o
> grant mapping? */
> + if ( p2mt == p2m_grant_map_ro )
> + {
> + gdprintk(XENLOG_WARNING,
> + "trying to write to read-only grant mapping\n");
> + guest_fault = 1;
> + goto check_access_handler;
> + }
> + } /* end access_w */
> +
> + check_access_handler:
> + /* Even if it is the guest's "fault", check with the mem_event
> interface instead if
> + * one is there */
> + if ( p2m_mem_access_check(gpa, gla_valid, gla, access_r,
> access_w, access_x) )
> + return 1;
p2m_mem_access_check() appears to _always_ return 1. Was that the
intention?
> + /* If there is no handler, then fault if guest_fault = 1 */
> + if ( guest_fault )
> {
> - gdprintk(XENLOG_WARNING,
> - "trying to write to read-only grant mapping\n");
> hvm_inject_exception(TRAP_gp_fault, 0, 0);
> return 1;
> }
>
> + /* Nothing handled it: it must be an access error with no memory
> handler, so fail */
> return 0;
> }
>
> diff -r a3cec4b94150 -r 06b0916eb91d xen/arch/x86/hvm/svm/svm.c
> --- a/xen/arch/x86/hvm/svm/svm.c Tue Jan 04 10:30:15 2011 -0800
> +++ b/xen/arch/x86/hvm/svm/svm.c Tue Jan 04 11:59:48 2011 -0800
> @@ -979,7 +979,7 @@
> __trace_var(TRC_HVM_NPF, 0, sizeof(_d), &_d);
> }
>
> - if ( hvm_hap_nested_page_fault(gfn) )
> + if ( hvm_hap_nested_page_fault(gpa, 0, ~0ull, 0, 0, 0) )
Surely this totally breaks the AMD NPT case -- you need to get the
access-type arguments right or else not rely on them in
hvm_hap_nested_page_fault().
[...]
> diff -r a3cec4b94150 -r 06b0916eb91d xen/arch/x86/mm/mem_access.c
> --- /dev/null Thu Jan 01 00:00:00 1970 +0000
> +++ b/xen/arch/x86/mm/mem_access.c Tue Jan 04 11:59:48 2011 -0800
> @@ -0,0 +1,59 @@
> +/******************************************************************************
> + * arch/x86/mm/mem_access.c
> + *
> + * Memory access support.
> + *
> + * Copyright (c) 2009 Citrix Systems, Inc.
Eh, probably not. :)
[...]
> diff -r a3cec4b94150 -r 06b0916eb91d xen/arch/x86/mm/p2m.c
> --- a/xen/arch/x86/mm/p2m.c Tue Jan 04 10:30:15 2011 -0800
> +++ b/xen/arch/x86/mm/p2m.c Tue Jan 04 11:59:48 2011 -0800
> @@ -2853,6 +2853,98 @@
> }
> #endif /* __x86_64__ */
>
> +int p2m_mem_access_check(unsigned long gpa, bool_t gla_valid,
> unsigned long gla,
> + bool_t access_r, bool_t access_w, bool_t access_x)
> +{
> + struct vcpu *v = current;
> + mem_event_request_t req;
> + unsigned long gfn = gpa >> PAGE_SHIFT;
> + struct domain *d = v->domain;
> + struct p2m_domain* p2m = p2m_get_hostp2m(d);
> + int res;
> + mfn_t mfn;
> + p2m_type_t p2mt;
> + p2m_access_t p2ma;
> +
> + /* First, handle rx2rw conversion automatically */
> + mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, p2m_query);
> +
> + if ( access_w && p2ma == p2m_access_rx2rw ) {
> + p2m_lock(p2m);
> + p2m->set_entry(p2m, gfn, mfn, 0, p2mt, p2m_access_rw);
Might be better to use p2m_change_type here; it's atomic so avoids
potential races.
> + p2m_unlock(p2m);
> +
> + return 1; /* handled */
> + }
> +
> + /* Otherwise, check if there is a memory event listener, and send
> the message along */
> + res = mem_event_check_ring(d);
> + if ( res < 0 )
> + {
> + /* No listener */
> + if ( p2m->access_required )
> + {
> + printk(XENLOG_INFO
> + "Memory access permissions failure, no mem_event
> listener: pausing VCPU %d, dom %d\n",
> + v->vcpu_id, d->domain_id);
> +
> + /* Will pause the VCPU */
> + (void) mem_event_check_listener(d);
Why does this need a special case? Could you just post the violation
to the ring and pause as normal, and then if a listener ever arrives it
can handle it?
In fact, I don't see why access_required needs to be a separate flag at
all - it's implied by setting the access permissions on a page or
setting the default to anything other than rwx. That would address
Keir's concern about adding a separate domcrf flag.
> + }
> + else
> + {
> + /* A listener is not required, so clear the access restrictions
> */
> + p2m_lock(p2m);
> + p2m->set_entry(p2m, gfn, mfn, 0, p2mt, p2m_access_rwx);
> + p2m_unlock(p2m);
> + }
> +
> + return 1;
> + }
> + else if ( res > 0 )
> + return 1; /* No space in buffer */
DYM "return 0" here? I think this function needs a comment describing
what the return value means.
[...]
> diff -r a3cec4b94150 -r 06b0916eb91d xen/include/asm-x86/mem_access.h
> --- /dev/null Thu Jan 01 00:00:00 1970 +0000
> +++ b/xen/include/asm-x86/mem_access.h Tue Jan 04 11:59:48 2011 -0800
> @@ -0,0 +1,35 @@
> +/******************************************************************************
> + * include/asm-x86/mem_paging.h
> + *
> + * Memory access support.
> + *
> + * Copyright (c) 2009 Citrix Systems, Inc.
Again, no.
[...]
> diff -r a3cec4b94150 -r 06b0916eb91d tools/xenpaging/xenpaging.c
> --- a/tools/xenpaging/xenpaging.c Tue Jan 04 10:30:15 2011 -0800
> +++ b/tools/xenpaging/xenpaging.c Tue Jan 04 11:59:48 2011 -0800
> @@ -658,7 +658,7 @@
> {
> DPRINTF("page already populated (domain = %d; vcpu = %d;"
> " p2mt = %x;"
> - " gfn = %"PRIx64"; paused = %"PRId64")\n",
> + " gfn = %"PRIx64"; paused = %d)\n",
> paging->mem_event.domain_id, req.vcpu_id,
> req.p2mt,
> req.gfn, req.flags & MEM_EVENT_FLAG_VCPU_PAUSED);
>
this belongs in a separate patch; it's not part of the change described
at the top.
Cheers,
Tim.
--
Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|