At 16:09 +0100 on 12 Aug (1281629364), Jan Beulich wrote:
> >>> On 23.07.10 at 15:49, Tim Deegan <Tim.Deegan@xxxxxxxxxx> wrote:
> > There are a few places in Xen where we walk a domain's page lists
> > without holding the page_alloc lock. They race with updates to the page
> > lists, which are normally rare but can be quite common under PoD when
> > the domain is close to its memory limit and the PoD reclaimer is busy.
> > This patch protects those places by taking the page_alloc lock.
> >
> > I think this is OK for the two debug-key printouts - they don't run from
> > irq context and look deadlock-free. The tboot change seems safe too
>
> While the comment says the patch would leave debug key printouts
> alone, ...

Sorry, my intention was to say that changes to the debug-key printouts
are safe, not that they didn't require changes.

The debug-key printouts (in particular the NUMA one) are where I
actually hit this bug on a running system.
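
For anyone following along, the shape of the fix is roughly the
following. This is only a userspace sketch, not the Xen code: a pthread
mutex stands in for the spinlock, and struct dom, struct page_node,
dom_add_page() and dom_dump_pages() are made-up names for illustration.

```c
/*
 * Userspace sketch only. A pthread mutex stands in for Xen's spinlock;
 * struct dom and struct page_node are hypothetical stand-ins, not Xen types.
 */
#include <pthread.h>
#include <stdio.h>

struct page_node {
    unsigned long mfn;
    struct page_node *next;
};

struct dom {
    pthread_mutex_t page_alloc_lock;   /* protects page_list */
    struct page_node *page_list;
};

/* Writer side: the list is only ever updated with the lock held. */
void dom_add_page(struct dom *d, struct page_node *pg)
{
    pthread_mutex_lock(&d->page_alloc_lock);
    pg->next = d->page_list;
    d->page_list = pg;
    pthread_mutex_unlock(&d->page_alloc_lock);
}

/*
 * Reader side: take the same lock for the whole walk, so that no entry
 * can be linked or unlinked while we are following ->next pointers.
 * Returns the number of pages visited.
 */
unsigned int dom_dump_pages(struct dom *d)
{
    unsigned int n = 0;
    struct page_node *pg;

    pthread_mutex_lock(&d->page_alloc_lock);
    for ( pg = d->page_list; pg != NULL; pg = pg->next )
    {
        printf("DomPage mfn=%#lx\n", pg->mfn);
        n++;
    }
    pthread_mutex_unlock(&d->page_alloc_lock);

    return n;
}
```

The cost is that the lock is held for the length of the dump, which
seems acceptable on a debug path that doesn't run from irq context.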
Tim.
> > unless tboot shutdown functions are called from irq context or with the
> > page_alloc lock held. The p2m one is the scariest but there are already
> > code paths in PoD that take the page_alloc lock with the p2m lock held
> > so it's no worse than existing code.
> >
> > Signed-off-by: Tim Deegan <Tim.Deegan@xxxxxxxxxx>
> >
> > diff -r e8dbc1262f52 xen/arch/x86/domain.c
> > --- a/xen/arch/x86/domain.c Wed Jul 21 09:02:10 2010 +0100
> > +++ b/xen/arch/x86/domain.c Fri Jul 23 14:33:22 2010 +0100
> > @@ -139,12 +139,14 @@ void dump_pageframe_info(struct domain *
>
> ... the actual patch still touches a respective function. It would seem
> to me that this part ought to be reverted.
>
> >      }
> >      else
> >      {
> > +        spin_lock(&d->page_alloc_lock);
> >          page_list_for_each ( page, &d->page_list )
> >          {
> >              printk(" DomPage %p: caf=%08lx, taf=%" PRtype_info "\n",
> >                     _p(page_to_mfn(page)),
> >                     page->count_info, page->u.inuse.type_info);
> >          }
> > +        spin_unlock(&d->page_alloc_lock);
> >      }
> >
> >      if ( is_hvm_domain(d) )
> > @@ -152,12 +154,14 @@ void dump_pageframe_info(struct domain *
> >          p2m_pod_dump_data(d);
> >      }
> >
> > +    spin_lock(&d->page_alloc_lock);
> >      page_list_for_each ( page, &d->xenpage_list )
> >      {
> >          printk(" XenPage %p: caf=%08lx, taf=%" PRtype_info "\n",
> >                 _p(page_to_mfn(page)),
> >                 page->count_info, page->u.inuse.type_info);
> >      }
> > +    spin_unlock(&d->page_alloc_lock);
> >  }
> >
> > struct domain *alloc_domain_struct(void)
>
> Sorry for not noticing this earlier.
>
> Jan
>
--
Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Principal Software Engineer, XenServer Engineering
Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel