To: Steven Smith <steven.smith@xxxxxxxxxx>
Subject: RE: [Xen-devel][Pv-ops][PATCH 0/3] Resend: Netback multiple thread support
From: "Xu, Dongxiao" <dongxiao.xu@xxxxxxxxx>
Date: Wed, 28 Apr 2010 18:27:48 +0800
Accept-language: en-US
Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Wed, 28 Apr 2010 03:29:30 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20100427104925.GA14523@xxxxxxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <4FA716B1526C7C4DB0375C6DADBC4EA342A7A7E95E@xxxxxxxxxxxxxxxxxxxxxxxxx> <EADF0A36011179459010BDF5142A457501D11C1BE3@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <4B182D87.6030901@xxxxxxxx> <EADF0A36011179459010BDF5142A457501D11C20F8@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <4B187513.80003@xxxxxxxx> <EADF0A36011179459010BDF5142A457501D13FDE62@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <4B200727.8040000@xxxxxxxx> <EADF0A36011179459010BDF5142A457501D13FE3BB@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <4B213766.7030201@xxxxxxxx> <D5AB6E638E5A3E4B8F4406B113A5A19A1D8CC03F@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <20100427104925.GA14523@xxxxxxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: Acrl91HgQ1AcATYIRnezbvrOzdTqqwAwfo9w
Thread-topic: [Xen-devel][Pv-ops][PATCH 0/3] Resend: Netback multiple thread support

Hi Steven,

Thanks for your careful review. Some explanations are inline.
As for your comments on group and interrupt affinity, I will reply in another
thread.

Thanks,
Dongxiao

Steven Smith wrote:
>> I'd like to make an update on these patches. The main logic is not
>> changed, and I only did a rebase towards the upstream pv-ops kernel.
>> See attached patch. The original patches are checked in Jeremy's
>> netback-tasklet branch.
> I have a couple of (quite minor) comments on the patches:
> 
> 0001-Netback-Generilize-static-global-variables-into-stru.txt:
>> diff --git a/drivers/xen/netback/netback.c b/drivers/xen/netback/netback.c
>> index c24debf..a484b0a 100644
>> --- a/drivers/xen/netback/netback.c
>> +++ b/drivers/xen/netback/netback.c
>> @@ -49,18 +49,13 @@
>> 
>>  /*define NETBE_DEBUG_INTERRUPT*/
>> 
>> +struct xen_netbk *xen_netbk;
> 
>> +int group_nr = 1;
>> +struct page_foreign_tracker *foreign_page_tracker;
> I think these two would benefit from more descriptive names, given
> that they're not static.

Oops... I thought I had modified this when Jeremy commented on it
last time; there must have been some mistake and it was left until today...
I swear I will change this in the next version of the patchset. ;-)
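Something along these lines, for example (just a sketch; the exact names
are open to suggestion):

/* Prefix the non-static globals so they don't pollute the kernel's
 * global namespace (proposed names only, not final): */
struct xen_netbk *xen_netbk;
int xen_netbk_group_nr = 1;
struct page_foreign_tracker *xen_netbk_page_tracker;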

> 
> 
> 
> If I was feeling pedantic, I'd complain that this includes some bits
> of support for multiple struct xen_netbks, rather than just moving all
> of the fields around, which reduces its obviously-correct-ness quite a
> bit.

Actually I was struggling with how to split the first patch into smaller
ones, but I didn't find a good way: since the patch changes function
interfaces and data structures, the corresponding callers need to change
too, which results in a ~1k-line patch...

> 
> Even more pedantically, it might be better to pass around a struct
> xen_netbk in a few places, rather than an int group, so that you get
> better compiler type checking.

I will change this in the next version of the patch.
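That is, something like the following prototypes (a sketch of the planned
change, not the final code):

/* Pass the group's struct xen_netbk directly instead of an int group,
 * so the compiler can type-check every caller: */
static void net_rx_action(struct xen_netbk *netbk);
static void net_tx_action(struct xen_netbk *netbk);
static int rx_work_todo(struct xen_netbk *netbk);
static int tx_work_todo(struct xen_netbk *netbk);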

> 
> 
> 
> 0002-Netback-Multiple-tasklets-support.txt:
> Design question: you do your balancing by making each tasklet serve
> roughly equal numbers of remote domains.  Would it not have been
> easier to make them serve equal numbers of netfronts?  You could then
> get rid of the domain_entry business completely and just have a count
> of serviced netfronts in each xen_netbk, which might be a bit easier
> to deal with.

According to my understanding, one guest with the VNIF driver is represented
by one netfront. Is this true? If so, I don't understand the difference
between "number of domains" and "number of netfronts"; I thought
they were the same. Please correct me if my understanding is wrong.

Actually, in a very early version of this patch, I used a simple method to
decide which group a netfront belongs to, by calculating
(domid % num_online_cpus()). The advantage is simplicity; however, it may
leave the netfront count unbalanced between different groups.

I will try to remove the "domain_entry"-related code in the next version of
the patch.
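As a sketch of that direction (netfront_count is a hypothetical new
per-group field here, not something in the current patch):

/* Pick the group currently serving the fewest netfronts when a new
 * netfront comes up; this would replace the domain_entry list. */
static struct xen_netbk *pick_netbk_group(void)
{
        struct xen_netbk *best = &xen_netbk[0];
        int i;

        for (i = 1; i < group_nr; i++)
                if (atomic_read(&xen_netbk[i].netfront_count) <
                    atomic_read(&best->netfront_count))
                        best = &xen_netbk[i];

        atomic_inc(&best->netfront_count);
        return best;
}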

> 
>> diff --git a/drivers/xen/netback/interface.c b/drivers/xen/netback/interface.c
>> index 21c1f95..bdf3c1d 100644
>> --- a/drivers/xen/netback/interface.c
>> +++ b/drivers/xen/netback/interface.c
>> @@ -54,6 +54,59 @@
>>  static unsigned long netbk_queue_length = 32;
>>  module_param_named(queue_length, netbk_queue_length, ulong, 0644);
>> 
>> 
>> +static void remove_domain_from_list(struct xen_netbk *netbk,
>> +                         struct xen_netif *netif)
>> +{
>> +    struct domain_entry *dom_entry = NULL;
>> +    int group = netif->group;
>> +
>> +    list_for_each_entry(dom_entry,
>> +                    &netbk[group].group_domain_list, dom) {
>> +            if (dom_entry->domid == netif->domid)
>> +                    break;
>> +    }
>> +    if (!dom_entry)
>> +            return;
> Can this ever happen?  If so, you might have issues when several
> netfronts all end up in the same frontend domain, e.g.:
> 
> -- netfront A arrives and is added to list
> -- netfront B arrives and is added to list
> -- netfront B is removed
> -- netfront B is removed again.  It's no longer in the list,
>    but A is, so A gets removed instead.
> 
> The end result being that netback thinks the group is idle, but it
> actually has netfront A in it.  I guess the worst that can happen is
> that you don't balance across tasklets properly, but it'd still be
> better avoided.
> 
> If it *can't* happen, there should probably be some kind of warning
> when it does.
> 
>> +
>> +    spin_lock(&netbk[netif->group].group_domain_list_lock);
>> +    netbk[netif->group].group_domain_nr--;
>> +    list_del(&dom_entry->dom);
>> +    spin_unlock(&netbk[netif->group].group_domain_list_lock);
>> +    kfree(dom_entry);
>> +}
>> +
>>  static void __netif_up(struct xen_netif *netif)
>>  {
>>      enable_irq(netif->irq);
>> 
>> @@ -70,6 +123,7 @@ static int net_open(struct net_device *dev)
>>  {
>>      struct xen_netif *netif = netdev_priv(dev);
>>      if (netback_carrier_ok(netif)) {
>> +            add_domain_to_list(xen_netbk, group_nr, netif);
>>              __netif_up(netif);
>>              netif_start_queue(dev);
>>      }
>> @@ -79,8 +133,10 @@ static int net_open(struct net_device *dev)
>>  static int net_close(struct net_device *dev)
>>  {
>>      struct xen_netif *netif = netdev_priv(dev);
>> -    if (netback_carrier_ok(netif))
>> +    if (netback_carrier_ok(netif)) {
>>              __netif_down(netif);
>> +            remove_domain_from_list(xen_netbk, netif);
>> +    }
>>      netif_stop_queue(dev);
>>      return 0;
>>  }
> Okay, so if the interface gets down'd and then up'd it'll potentially
> move to a different group.  How much testing has that situation had?
> 
> I'd be tempted to add the interface to the list as soon as it's
> created and then leave it there until it's removed, and not rebalance
> during its lifetime at all, and hence avoid the issue.
> 
>> @@ -1570,6 +1570,7 @@ static int __init netback_init(void)
>>      /* We can increase reservation by this much in net_rx_action(). */
>>  //  balloon_update_driver_allowance(NET_RX_RING_SIZE);
>> 
>> +    group_nr = num_online_cpus();
>>      xen_netbk = kzalloc(group_nr * sizeof(struct xen_netbk),
>>                          GFP_KERNEL);
>>      if (!xen_netbk) {
>>              printk(KERN_ALERT "%s: out of memory\n", __func__);
> What happens if the number of online CPUs changes while netback is
> running?  In particular, do we stop trying to send a tasklet/thread to
> a CPU which has been offlined?

group_nr just defines the maximum number of tasklets; it doesn't decide
which tasklet is handled by which CPU. That is decided by where the
interrupt is delivered.
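To illustrate: the per-group tasklet is scheduled from the event-channel
interrupt handler, so it runs on whichever CPU the interrupt was delivered
to. A simplified sketch (the net_tx_tasklet field name follows patch 2;
this is not the exact patch code):

irqreturn_t netif_be_int(int irq, void *dev_id)
{
        struct xen_netif *netif = dev_id;
        struct xen_netbk *netbk = &xen_netbk[netif->group];

        /* tasklet_schedule() runs the tasklet on the scheduling CPU,
         * so tasklet placement simply follows interrupt delivery. */
        tasklet_schedule(&netbk->net_tx_tasklet);
        return IRQ_HANDLED;
}

As far as I understand, if a CPU goes offline its interrupts are migrated
by the generic IRQ code, and the tasklets follow them.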

> 
> 
> 0003-Use-Kernel-thread-to-replace-the-tasklet.txt:
>> diff --git a/drivers/xen/netback/netback.c b/drivers/xen/netback/netback.c
>> index 773cd4f..8b55efc 100644
>> --- a/drivers/xen/netback/netback.c
>> +++ b/drivers/xen/netback/netback.c
>> +static int netbk_action_thread(void *index)
>> +{
>> +    int group = (long)index;
>> +    struct xen_netbk *netbk = &xen_netbk[group];
>> +    while (1) {
>> +            wait_event_interruptible(netbk->netbk_action_wq,
>> +                            rx_work_todo(group)
>> +                            || tx_work_todo(group));
>> +            cond_resched();
>> +
>> +            if (rx_work_todo(group))
>> +                    net_rx_action(group);
>> +
>> +            if (tx_work_todo(group))
>> +                    net_tx_action(group);
>> +    }
> Hmm... You use kthread_stop() on this thread, so you should probably
> test kthread_should_stop() every so often.

OK, I will modify it.
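That is, roughly the following (a sketch of the planned fix):

while (!kthread_should_stop()) {
        wait_event_interruptible(netbk->netbk_action_wq,
                        rx_work_todo(group)
                        || tx_work_todo(group)
                        || kthread_should_stop());

        if (kthread_should_stop())
                break;

        if (rx_work_todo(group))
                net_rx_action(group);

        if (tx_work_todo(group))
                net_tx_action(group);

        cond_resched();
}

This way kthread_stop() both wakes the thread and terminates the loop.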

> 
>> +
>> +    return 0;
>> +}
>> +
>> +
>>  static int __init netback_init(void)
>>  {
>>      int i;
> 
> 
> Apart from that, it all looks fine to me.
> 
> Steven.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
