Re: [Xen-ia64-devel][PATCH] Enable SMP on VTI domain.
On Thu, Jun 29, 2006 at 02:21:48PM +0800, Xu, Anthony wrote:
> >From: Isaku Yamahata [mailto:yamahata@xxxxxxxxxxxxx]
> >Sent: 2006?6?21? 17:25
> >To: Xu, Anthony
> >Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
> >Subject: Re: [Xen-ia64-devel][PATCH] Enable SMP on VTI domain.
> >On Thu, Jun 01, 2006 at 12:55:08PM +0800, Xu, Anthony wrote:
> >> >> Or you mean the protection of global purge.
> >> >> When a vcpu get IPI to purge TLB,
> >> >> What it does is to invalid the TLB entry in VHPT,
> >> >> but not remove the TLB entry.
> >> >> There is no race condition.
> >> >
> >> >Is there any gurantee that the vcpu which recives IPI isn't touching VHPT?
> >> The vcpu which receives IPI can touch VHPT in the same time.
> >> Because purge operation only sets the TLB entry invalid, like entry->ti=1.
> >> That has the same philosophy with Tristan's direct purge
> >Could you review the two attached patches?
> >Purge function traverses the collision chain when IPI is sent.
> >But there is a window when the assumption of the collision chain
> >is broken.
> >vmx_hpw_miss() has a race. ia64_do_page_fault() had a similar race before.
> Sorry for late response.
> The second patch is good cleanup and improvement.
The second patch is also a bug fix patch.
> I don't understand the race condition the first patch fixes.
> Could you please elaborate this?
The patch fixes two races.
- a race in vmx_process() of vmx_process.c
The same race was there in ia64_do_page_fault() before.
The check, (fault == IA64_USE_TLB && !current->arch.dtlb.pte.p), in
ia64_do_page_fault() avoids this race.
- a race in vtlb.c
vmx_vcpu_ptc_l() needs a certain condition on the collision chain
to traverse collision chains and purge entries correctly.
But there are windows when the condision is broken.
The critical areas are surrounded by local_irq_save() and local_irq_restore()
with the patch.
If a vcpu send a IPI to another vcpu for ptc.ga when another vcpu is
in the critical area, things goes bad.
Actually the patch assumes that the targeted vcpu is still
on the physical cpu which received IPI.
It might be reasonable to assume so before credit scheduler was
If the targeted vcpu moved to another physical cpu,
the collision chain is traversed on the IPI'ed physical cpu and
the collision chain is modified on the physical cpu on which the
targeted vcpu runs at the same time.
The collision chain modification/traverse code doesn't seem to
be lock-free. Something bad would happen.
In fact I suspect that there still remains a problem.
I haven't checked the generated assembler code and I'm not sure though.
while (proc != v->processor);
v->processor's type is int.
However v->processor might be changed asynchronously by vcpu scheduler
on another physical cpu.
Compiler barrier and memory barrier might be needed somewhere.
Xen-ia64-devel mailing list