I'll rework the patches to incorporate the feedback and resend them.
I've got a couple of quick notes, inline below.
>> @@ -135,16 +166,28 @@ void mem_event_put_request(struct domain
>> + /*
>> + * We ensure that each vcpu can put at least *one* event -- because some
>> + * events are not repeatable, such as dropping a page. This will ensure no
>> + * vCPU is left with an event that they must place on the ring, but cannot.
>> + * They will be paused after the event is placed.
>> + * See large comment below in mem_event_unpause_vcpus().
>> + */
>> + if( current->domain->domain_id == d->domain_id &&
>> + mem_event_ring_free(d, med) < d->max_vcpus )
>> + mem_event_mark_and_pause(current, med);
>> +
>
> This idiom of comparing domain-ids cropped up in the earlier mem-event
> patches and seems to be spreading, but the right check is just
> (current->domain == d).
>
> Also: are there cases where current->domain != d? If so, can't those cases
> cause the ring to fill up?
Yes, I believe there are cases where a different domain (i.e. the domain
running the device model) can map a page and generate an event (mapping a
paged-out page, creating writable mappings of pages whose access bits
forbid writes, etc.). Unfortunately, we can't pause any vcpu in those
cases.
This is something I intend to revisit, as guaranteeing that absolutely no
events are lost may be quite complicated (especially since certain events
are not repeatable). I'm considering using the new wait queues, or some
other mechanism, to wait in the same context until the ring clears up...
but that is tricky in its own way.
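Roughly what I have in mind for the wait-queue route is something like the
following (very much a sketch: it assumes a waitqueue_head field, here
called med->wq, added to the mem_event struct, and an interface along the
lines of wait_event()/wake_up_all() from xen/wait.h):

    /* Producer side: block the current vcpu until there is at least
     * one free slot on the ring before placing the request. */
    wait_event(med->wq, mem_event_ring_free(d, med) > 0);

    /* Consumer side: once requests are taken off the ring, kick any
     * waiters so they can retry placing their events. */
    wake_up_all(&med->wq);

The awkward part would be making sure every code path that generates an
event is actually allowed to sleep at that point.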
I'm quite glad you pointed this out, because I believe the
mem_event_mark_and_pause() call following the 'possible lost mem_event for
domain' warning is incorrect: if a different domain is generating the
event, that line is very wrong.
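For the record, here is roughly how I expect the corrected guard to look
(a sketch only, reusing the helpers from the patch above; the
foreign-domain case simply skips the mark-and-pause, which is why the
lost-event path needs a separate fix):

    /* Only mark-and-pause a vcpu of the target domain itself: a foreign
     * domain (e.g. the one running the device model) has no vcpu here
     * that we can legitimately pause. */
    if ( current->domain == d &&
         mem_event_ring_free(d, med) < d->max_vcpus )
        mem_event_mark_and_pause(current, med);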
>> + for_each_vcpu ( d, v )
>> + {
>> + if ( !(med->blocked) || online >= mem_event_ring_free(d, med) )
>> + break;
>
> Is there a risk that under heavy mem-event loads vcpu 0 might starve
> other vcpus entirely because they're never allowed to unpause here?
Unfortunately, yes, under heavy mem-event load (and a dom0 that isn't
taking events off the ring fast enough). I think there are two reasonable
alternatives:
1) Unpausing a vcpu at random.
2) Waiting until the ring reaches a watermark and unpausing all vCPUs.
Any thoughts on these?
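To make option 2 concrete, this is roughly what I am picturing for
mem_event_unpause_vcpus() (a sketch only, reusing the names from the
patch; the watermark value is an arbitrary placeholder, and it assumes
med->blocked counts marked vcpus as in the patch):

    /* Hold off until the ring has drained to a watermark, then release
     * every blocked vcpu at once rather than favouring vcpu 0.
     * MEM_EVENT_RING_WATERMARK is a placeholder threshold. */
    if ( mem_event_ring_free(d, med) < MEM_EVENT_RING_WATERMARK )
        return;

    for_each_vcpu ( d, v )
    {
        if ( test_and_clear_bit(_VPF_mem_event, &v->pause_flags) )
        {
            med->blocked--;
            vcpu_wake(v);
        }
    }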
>> + if ( !test_bit(_VPF_mem_event, &v->pause_flags) )
>> + {
>> + set_bit(_VPF_mem_event, &v->pause_flags);
>
> Does this need to be an atomic test-and-set?
I believe it is currently always called with mem_event_ring_lock held,
but it should use an atomic test-and-set anyway, in case it ever isn't.
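I.e. something like this (trivial sketch; the body is just whatever the
patch already does after marking the vcpu):

    /* test_and_set_bit() makes the check-and-mark atomic, so this stays
     * correct even if a caller ever reaches it without holding
     * mem_event_ring_lock. */
    if ( !test_and_set_bit(_VPF_mem_event, &v->pause_flags) )
    {
        /* ... same marking/pausing work as in the patch above ... */
    }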
Cheers!
-Adin