[Xen-devel] Re: [PATCH R3 7/7] xen/balloon: Memory hotplug support for X

On Thu, Feb 10, 2011 at 11:53:16AM -0500, Konrad Rzeszutek Wilk wrote:
> On Thu, Feb 03, 2011 at 05:30:33PM +0100, Daniel Kiper wrote:

[...]

> > +static struct resource *adjust_memory_resource(struct resource *r, unsigned long nr_pages)
> > +{
> > +   int rc;
> > +
> > +   if (r->end + 1 - (nr_pages << PAGE_SHIFT) == r->start) {
>
> Will this actually occur? Say I called 'allocate_additional_memory' with 512
> and got -ENOMEM (so the hypercall failed completely). The mh->policy is
> MH_POLICY_STOP_AT_FIRST_ERROR. So we end up here. Assume the r_min is
> 0x100000000, then r->start is 0x100000000 and r->end is 0x100200000.
>
> So:
>   0x100200001 - (0x200000) == 0x100000000 ?

r->end always points to the last byte of the currently allocated resource,
i.e. r->end == r->start + size - 1.
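
To illustrate with your numbers (a standalone userspace sketch, not part of
the patch; PAGE_SHIFT == 12 assumed):

#include <assert.h>
#include <stdio.h>

#define PAGE_SHIFT 12

int main(void)
{
        unsigned long long start = 0x100000000ULL;      /* r->start from your example */
        unsigned long long nr_pages = 512;

        /* r->end is inclusive: it is the last byte of the resource. */
        unsigned long long end = start + (nr_pages << PAGE_SHIFT) - 1;

        printf("r->end = %#llx\n", end);        /* 0x1001fffff, not 0x100200000 */

        /* Releasing everything: end + 1 - (nr_pages << PAGE_SHIFT) == start. */
        assert(end + 1 - (nr_pages << PAGE_SHIFT) == start);

        /* And the size of the resource is end + 1 - start. */
        assert(end + 1 - start == (nr_pages << PAGE_SHIFT));

        return 0;
}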

> > +           rc = release_resource(r);
> > +           BUG_ON(rc < 0);
> > +           kfree(r);
> > +           return NULL;
> > +   }
> > +
> > +   rc = adjust_resource(r, r->start, r->end + 1 - r->start -
> > +                           (nr_pages << PAGE_SHIFT));
>
> If we wanted 512 pages, and only got 256 and want to adjust the region, we
> would want it to be:
> 0x100000000 -> 0x100100000, right?
>
> So the third argument comes out to be:
>
> 0x100200000 + 1 - 0x100000000 - (0x100000) = 0x100001
>
> which is just one page above what we requested?

Please look above: r->end is inclusive, so in your example it is 0x1001fffff
and the third argument is 0x1001fffff + 1 - 0x100000000 - 0x100000 = 0x100000,
i.e. exactly the 256 pages you expect.

> > +
> > +   BUG_ON(rc < 0);
>
> Can we just do WARN_ON, and return NULL instead (and also release the 
> resource)?

I will rethink that once again.
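
If I go the WARN_ON route, the shape would be roughly this (an untested
sketch of your suggestion, not the final code):

        rc = adjust_resource(r, r->start, r->end + 1 - r->start -
                                (nr_pages << PAGE_SHIFT));

        if (WARN_ON(rc < 0)) {
                /* Sketch: give the region back instead of crashing. */
                release_resource(r);
                kfree(r);
                return NULL;
        }

        return r;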

> > +
> > +   return r;
> > +}
> > +
> > +static int allocate_additional_memory(struct resource *r, unsigned long nr_pages)
> > +{
> > +   int rc;
> > +   struct xen_memory_reservation reservation = {
> > +           .address_bits = 0,
> > +           .extent_order = 0,
> > +           .domid        = DOMID_SELF
> > +   };
> > +   unsigned long flags, i, pfn, pfn_start;
> > +
> > +   if (!nr_pages)
> > +           return 0;
> > +
> > +   pfn_start = PFN_UP(r->end) - nr_pages;
> > +
> > +   if (nr_pages > ARRAY_SIZE(frame_list))
> > +           nr_pages = ARRAY_SIZE(frame_list);
> > +
> > +   for (i = 0, pfn = pfn_start; i < nr_pages; ++i, ++pfn)
> > +           frame_list[i] = pfn;
> > +
> > +   set_xen_guest_handle(reservation.extent_start, frame_list);
> > +   reservation.nr_extents = nr_pages;
> > +
> > +   spin_lock_irqsave(&xen_reservation_lock, flags);
> > +
> > +   rc = HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation);
> > +
> > +   if (rc <= 0)
> > +           return (rc < 0) ? rc : -ENOMEM;
> > +
>
> So if we populated some of them (say we wanted 512, but only did 64),
> don't we want to do the loop below?  Also you seem to be forgetting to
> do a spin_unlock_irqrestore if you quit here.

The loop you mentioned is skipped only when HYPERVISOR_memory_op()
does not allocate anything (0 pages) or something went wrong and
an error code was returned.

The spin_lock_irqsave()/spin_unlock_irqrestore() calls should be removed,
as was done in increase_reservation()/decrease_reservation().
I overlooked those calls. Thanks.
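
For reference, with the locking dropped the tail of allocate_additional_memory()
would look roughly like this (a sketch of the intended change, not the final
patch):

        rc = HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation);

        /* rc is the number of extents actually populated; <= 0 means none. */
        if (rc <= 0)
                return (rc < 0) ? rc : -ENOMEM;

        /* Partial success is fine: only the first rc frames are wired up. */
        for (i = 0, pfn = pfn_start; i < rc; ++i, ++pfn) {
                BUG_ON(!xen_feature(XENFEAT_auto_translated_physmap) &&
                       phys_to_machine_mapping_valid(pfn));
                set_phys_to_machine(pfn, frame_list[i]);
        }

        return rc;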

> > +   for (i = 0, pfn = pfn_start; i < rc; ++i, ++pfn) {
> > +           BUG_ON(!xen_feature(XENFEAT_auto_translated_physmap) &&
> > +                  phys_to_machine_mapping_valid(pfn));
> > +           set_phys_to_machine(pfn, frame_list[i]);
> > +   }
> > +
> > +   spin_unlock_irqrestore(&xen_reservation_lock, flags);
> > +
> > +   return rc;
> > +}
> > +
> > +static void hotplug_allocated_memory(struct resource *r)
> >  {
> > -   unsigned long target = balloon_stats.target_pages;
> > +   int nid, rc;
> > +   resource_size_t r_size;
> > +   struct memory_block *mem;
> > +   unsigned long pfn;
> > +
> > +   r_size = r->end + 1 - r->start;
>
> Why bump it by one byte?

Please look above: r->end is inclusive, so the size is r->end + 1 - r->start.

> > +   nid = memory_add_physaddr_to_nid(r->start);
> > +
> > +   rc = add_registered_memory(nid, r->start, r_size);
> > +
> > +   if (rc) {
> > +           pr_err("%s: add_registered_memory: Memory hotplug failed: %i\n",
> > +                   __func__, rc);
> > +           balloon_stats.target_pages = balloon_stats.current_pages;
> > +           return;
> > +   }
> > +
> > +   if (xen_pv_domain())
> > +           for (pfn = PFN_DOWN(r->start); pfn < PFN_UP(r->end); ++pfn)
>
> I think you want the r->start to be PFN_UP just in case the r->start is not page
> aligned. Though I am not sure it would even happen anymore, as M A Young
> found the culprit that made it possible for us to set up memory regions
> non-aligned and that is fixed now (in 2.6.38).

r->start is always page aligned because allocate_resource()
always returns a page-aligned resource (the alignment is forced by its
arguments).
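
For reference, the region is obtained roughly like this (a sketch of
allocate_memory_resource(), not the exact code from the patch; the name,
flags and min/max window below are placeholders):

        struct resource *r;
        int rc;

        r = kzalloc(sizeof(*r), GFP_KERNEL);
        if (!r)
                return NULL;

        r->name  = "System RAM";
        r->flags = IORESOURCE_MEM | IORESOURCE_BUSY;

        /* align == PAGE_SIZE is what forces a page-aligned r->start. */
        rc = allocate_resource(&iomem_resource, r, nr_pages << PAGE_SHIFT,
                               0, -1, PAGE_SIZE, NULL, NULL);
        if (rc < 0) {
                kfree(r);
                return NULL;
        }

        return r;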

> > +                   if (!PageHighMem(pfn_to_page(pfn))) {
> > +                           rc = HYPERVISOR_update_va_mapping(
> > +                                   (unsigned long)__va(pfn << PAGE_SHIFT),
> > +                                   mfn_pte(pfn_to_mfn(pfn), PAGE_KERNEL), 0);
> > +                           BUG_ON(rc);
> > +                   }
> >
> > -   target = min(target,
> > -                balloon_stats.current_pages +
> > -                balloon_stats.balloon_low +
> > -                balloon_stats.balloon_high);
> > +   rc = online_pages(PFN_DOWN(r->start), r_size >> PAGE_SHIFT);
> >
> > -   return target;
> > +   if (rc) {
> > +           pr_err("%s: online_pages: Failed: %i\n", __func__, rc);
> > +           balloon_stats.target_pages = balloon_stats.current_pages;
> > +           return;
> > +   }
> > +
> > +   for (pfn = PFN_DOWN(r->start); pfn < PFN_UP(r->end); pfn += PAGES_PER_SECTION) {
>
> Ditto. Can you do PFN_UP(r->start)?

Please look above: r->start is always page aligned, so PFN_DOWN(r->start) ==
PFN_UP(r->start) here anyway.

> > +           mem = find_memory_block(__pfn_to_section(pfn));
> > +           BUG_ON(!mem);
> > +           BUG_ON(!present_section_nr(mem->phys_index));
> > +           mutex_lock(&mem->state_mutex);
> > +           mem->state = MEM_ONLINE;
> > +           mutex_unlock(&mem->state_mutex);
> > +   }
> > +
> > +   balloon_stats.current_pages += r_size >> PAGE_SHIFT;
> >  }
> >
> > +static enum bp_state request_additional_memory(long credit)
> > +{
> > +   int rc;
> > +   static struct resource *r;
> > +   static unsigned long pages_left;
> > +
> > +   if ((credit <= 0 || balloon_stats.balloon_low ||
> > +                           balloon_stats.balloon_high) && !r)
> > +           return BP_DONE;
> > +
> > +   if (!r) {
> > +           r = allocate_memory_resource(credit);
> > +
> > +           if (!r)
> > +                   return BP_ERROR;
> > +
> > +           pages_left = credit;
> > +   }
> > +
> > +   rc = allocate_additional_memory(r, pages_left);
> > +
> > +   if (rc < 0) {
> > +           if (balloon_stats.mh_policy == MH_POLICY_TRY_UNTIL_SUCCESS)
> > +                   return BP_ERROR;
> > +
> > +           r = adjust_memory_resource(r, pages_left);
> > +
> > +           if (!r)
> > +                   return BP_ERROR;
>
> Say we failed the hypercall completely and got -ENOMEM from
> 'allocate_additional_memory'.
> I presume the adjust_memory_resource at this point would have deleted 'r',
> which means that (!r) and we return BP_ERROR.
>
> But that means that we aren't following the MH_POLICY_STOP_AT_FIRST_ERROR as
> the balloon_process will retry again and again, and again??

You are right. It should return BP_DONE.
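
i.e. that branch becomes (sketch of the fix):

        r = adjust_memory_resource(r, pages_left);

        if (!r)
                return BP_DONE; /* stop retrying under MH_POLICY_STOP_AT_FIRST_ERROR */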

> > +   } else {
> > +           pages_left -= rc;
> > +
>
> So say I request 512 pages (mh_policy is MH_POLICY_STOP_AT_FIRST_ERROR),
> but only get 256. I adjust the pages_left to be 256 and then
> > +           if (pages_left)
> > +                   return BP_HUNGRY;
>
> we return BP_HUNGRY. That makes 'balloon_process' retry with 512 pages, and we
> keep on trying and call "allocate_additional_memory", which fails once more
> (returns 256), and we end up returning BP_HUNGRY, and retry... and so on.
>
> Would it make sense to have a check here for MH_POLICY_STOP_AT_FIRST_ERROR
> and, if so, call adjust_memory_resource as well?

Here it is OK. The first time, allocate_additional_memory() returns 256, which
is fine, and the next time, if more memory is not available, it returns rc < 0,
which forces execution of the "if (rc < 0) {..." branch (as expected).
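
To spell the sequence out, a toy userspace model of the control flow (nothing
Xen-specific; the 512/256 numbers are from your example and the second call is
assumed to return -ENOMEM):

#include <stdio.h>

/* Toy model: the hypervisor hands out 256 pages once, then nothing. */
static int fake_allocate_additional_memory(void)
{
        static int calls;
        return (calls++ == 0) ? 256 : -12;      /* -12 == -ENOMEM */
}

int main(void)
{
        unsigned long pages_left = 512;
        int round;

        for (round = 1; ; round++) {
                int rc = fake_allocate_additional_memory();

                if (rc < 0) {
                        /* MH_POLICY_STOP_AT_FIRST_ERROR: adjust the resource, stop. */
                        printf("round %d: rc=%d -> adjust_memory_resource(), BP_DONE\n",
                               round, rc);
                        break;
                }

                pages_left -= rc;
                printf("round %d: got %d pages, %lu left -> %s\n", round, rc,
                       pages_left, pages_left ? "BP_HUNGRY (retry)" : "BP_DONE");

                if (!pages_left)
                        break;
        }

        return 0;
}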

Daniel

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel