Steven Smith wrote:
>>>> I'd like to make an update on these patches. The main logic is not
>>>> changed; I only did a rebase against the upstream pv-ops kernel.
>>>> See the attached patch. The original patches are checked into
>>>> Jeremy's netback-tasklet branch.
>>> I have a couple of (quite minor) comments on the patches:
>>>
>>> 0001-Netback-Generilize-static-global-variables-into-stru.txt:
>>>> diff --git a/drivers/xen/netback/netback.c b/drivers/xen/netback/netback.c
>>>> index c24debf..a484b0a 100644
>>>> --- a/drivers/xen/netback/netback.c
>>>> +++ b/drivers/xen/netback/netback.c
>>>> @@ -49,18 +49,13 @@
>>>>
>>>> /*define NETBE_DEBUG_INTERRUPT*/
>>>>
>>>> +struct xen_netbk *xen_netbk;
>>>
>>>> +int group_nr = 1;
>>>> +struct page_foreign_tracker *foreign_page_tracker;
>>> I think these two would benefit from more descriptive names, given
>>> that they're not static.
>> Oops... I thought I had modified this when Jeremy commented on it
>> last time; maybe there was some mistake and I left it until today...
>> Easily done.
>
>>> If I was feeling pedantic, I'd complain that this includes some bits
>>> of support for multiple struct xen_netbks, rather than just moving
>>> all of the fields around, which reduces its obviously-correct-ness
>>> quite a bit.
>> Actually I was struggling with how to split the first patch into
>> smaller ones, but I didn't see a good way to do it: the patch changes
>> the function interfaces and data structures, so the corresponding
>> callers need to change too, which produces a ~1k-line patch...
> The approach I would take would be something like this:
>
> 1) Gather all the global data together into a struct xen_netbk, and
> then have a single, global, instance of that structure. Go through
> and turn every reference to a bit of global data into a reference
> to a field of that structure. This will be a massive patch, but
> it's purely mechanical and it's very easy to check that it's safe.
>
> 2) Introduce struct ext_page and use it everywhere it is used in the
> current patch. This should be fairly small.
>
> 3) Generalise to multiple struct xen_netbks by changing the single
> global instance into a struct xen_netbk * and fixing the resulting
> compile errors. Another big patch, but provided you remember to
> initialise everything properly the compiler will check almost all
> of it for you.
>
> This is to some extent a bikeshed argument, so if you prefer the
> current patch structure then that would work as well.
Thanks for your suggestion; I will give it a try.
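Just to make sure I understand step 1), would it look roughly like the
sketch below? (The fields here are only placeholders for illustration,
not the exact list of netback globals.)

/* Step 1) sketch: gather the former static globals into one struct and
 * keep a single global instance; field names are placeholders. */
struct xen_netbk {
        struct sk_buff_head rx_queue;   /* was: static rx_queue */
        struct sk_buff_head tx_queue;   /* was: static tx_queue */
        struct timer_list net_timer;    /* was: static net_timer */
        /* ... every other former global moves in here ... */
};

static struct xen_netbk netbk;          /* the single global instance */

/* Then every old reference such as skb_queue_tail(&rx_queue, skb)
 * mechanically becomes skb_queue_tail(&netbk.rx_queue, skb). */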
>
>>> Even more pedantically, it might be better to pass around a struct
>>> xen_netbk in a few places, rather than an int group, so that you get
>>> better compiler type checking.
>> I will change this in the next version of the patch. Thanks.
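For example, something along these lines (illustrative prototypes only,
not the final names):

/* before: callers pass an index into the xen_netbk array */
static void net_rx_action(int group);

/* after: callers pass the group itself, so the compiler can catch
 * a group being confused with some other integer */
static void net_rx_action(struct xen_netbk *netbk);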
>
>>> 0002-Netback-Multiple-tasklets-support.txt:
>>> Design question: you do your balancing by making each tasklet serve
>>> roughly equal numbers of remote domains. Would it not have been
>>> easier to make them serve equal numbers of netfronts? You could
>>> then get rid of the domain_entry business completely and just have
>>> a count of serviced netfronts in each xen_netbk, which might be a
>>> bit easier to deal with.
>> According to my understanding, one guest with the VNIF driver is
>> represented by one netfront. Is this true? If so, I don't understand
>> the difference between "number of domains" and "number of netfronts";
>> I used to think they were the same. Please correct me if my
>> understanding is wrong.
> I think we might be using slightly different terminology here. When I
> say ``netfront'', I mean the frontend half of a virtual network
> interface, rather than the netfront driver, so a single domain can be
> configured with multiple netfronts in the same way that a single
> physical host can have multiple ixgbes (say), despite only having one
> ixgbe driver loaded.
>
> So, my original point was that it might be better to balance
> interfaces such that the number of interfaces in each group is the
> same, ignoring the frontend domain ID completely. This would mean
> that if, for instance, a domain had two very busy NICs then they
> wouldn't be forced to share a tasklet, which might otherwise be a
> bottleneck.
>
> The downside, of course, is that it would allow domains with multiple
> vNICs to use more dom0 CPU time, potentially aggravating starvation
> and unfairness problems. On the other hand, a domain with N vNICs
> wouldn't be able to do any more damage than N domains with 1 vNIC
> each, so I don't think it's too bad.
That was my misunderstanding previously; thanks for the explanation on this point.
Regards,
Dongxiao
>
>> Actually, in a very early stage of this patch, I used a simple method
>> to identify which group a netfront belongs to, by calculating
>> (domid % num_online_cpus()). The advantage is simplicity; however, it
>> may leave the netfront count unbalanced between different groups.
> Well, any static scheme will potentially come unbalanced some of the
> time, if different interfaces experience different levels of traffic.
> But see the other thread for another discussion of balancing issues.
>
>> I will try to remove the "domain_entry"-related code in the next
>> version of the patch. Thanks.
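If each struct xen_netbk only keeps a count of the netfronts it serves,
a new interface could simply be assigned to the least-loaded group.
A rough sketch (netfront_count is a hypothetical field name):

/* Rough sketch: pick the group currently serving the fewest netfronts.
 * "netfront_count" is a hypothetical per-group counter. */
static struct xen_netbk *netbk_pick_group(void)
{
        struct xen_netbk *best = &xen_netbk[0];
        int i;

        for (i = 1; i < group_nr; i++)
                if (xen_netbk[i].netfront_count < best->netfront_count)
                        best = &xen_netbk[i];

        best->netfront_count++;
        return best;
}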
>
>>>> @@ -1570,6 +1570,7 @@ static int __init netback_init(void)
>>>> 	/* We can increase reservation by this much in net_rx_action(). */
>>>> //	balloon_update_driver_allowance(NET_RX_RING_SIZE);
>>>>
>>>> +	group_nr = num_online_cpus();
>>>> 	xen_netbk = kzalloc(group_nr * sizeof(struct xen_netbk), GFP_KERNEL);
>>>> 	if (!xen_netbk) {
>>>> 		printk(KERN_ALERT "%s: out of memory\n", __func__);
>>> What happens if the number of online CPUs changes while netback is
>>> running? In particular, do we stop trying to send a tasklet/thread
>>> to a CPU which has been offlined?
>> group_nr just defines the maximum number of tasklets; it doesn't
>> decide which tasklet is handled by which CPU. That is decided by the
>> delivery of the interrupt.
> Ah, yes, so it is. Thanks for explaining it.
>
>>> 0003-Use-Kernel-thread-to-replace-the-tasklet.txt:
>>>> diff --git a/drivers/xen/netback/netback.c b/drivers/xen/netback/netback.c
>>>> index 773cd4f..8b55efc 100644
>>>> --- a/drivers/xen/netback/netback.c
>>>> +++ b/drivers/xen/netback/netback.c
>>>> +static int netbk_action_thread(void *index)
>>>> +{
>>>> +	int group = (long)index;
>>>> +	struct xen_netbk *netbk = &xen_netbk[group];
>>>> +	while (1) {
>>>> +		wait_event_interruptible(netbk->netbk_action_wq,
>>>> +				rx_work_todo(group)
>>>> +				|| tx_work_todo(group));
>>>> +		cond_resched();
>>>> +
>>>> +		if (rx_work_todo(group))
>>>> +			net_rx_action(group);
>>>> +
>>>> +		if (tx_work_todo(group))
>>>> +			net_tx_action(group);
>>>> +	}
>>> Hmm... You use kthread_stop() on this thread, so you should probably
>>> test kthread_should_stop() every so often.
>> OK, I will modify it.
> Thanks.
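Something like the following, I suppose (just a sketch of how the loop
could check kthread_should_stop()):

        while (!kthread_should_stop()) {
                wait_event_interruptible(netbk->netbk_action_wq,
                                rx_work_todo(group)
                                || tx_work_todo(group)
                                || kthread_should_stop());
                cond_resched();

                if (kthread_should_stop())
                        break;

                if (rx_work_todo(group))
                        net_rx_action(group);

                if (tx_work_todo(group))
                        net_tx_action(group);
        }
        return 0;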
>
> Steven.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel