> I'd like to make an update on these patches. The main logic is not
> changed, and I only did a rebase towards the upstream pv-ops kernel.
> See attached patch. The original patches are checked in Jeremy's
> netback-tasklet branch.
I have a couple of (quite minor) comments on the patches:
0001-Netback-Generilize-static-global-variables-into-stru.txt:
> diff --git a/drivers/xen/netback/netback.c b/drivers/xen/netback/netback.c
> index c24debf..a484b0a 100644
> --- a/drivers/xen/netback/netback.c
> +++ b/drivers/xen/netback/netback.c
> @@ -49,18 +49,13 @@
>
> /*define NETBE_DEBUG_INTERRUPT*/
>
> +struct xen_netbk *xen_netbk;
> +int group_nr = 1;
> +struct page_foreign_tracker *foreign_page_tracker;
I think group_nr and foreign_page_tracker in particular would benefit
from more descriptive names, given that they're not static.
If I were feeling pedantic, I'd complain that this patch includes some
bits of support for multiple struct xen_netbks, rather than just moving
the fields around, which reduces its obvious correctness quite a bit.
Even more pedantically, it might be better to pass a struct xen_netbk
pointer around in a few places, rather than an int group, so that you
get better compiler type checking.
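Roughly this sort of thing (just a sketch; net_tx_action() stands in
for whichever of the converted helpers):

        static void net_tx_action(int group)
        {
                struct xen_netbk *netbk = &xen_netbk[group];
                /* ... */
        }

becomes

        static void net_tx_action(struct xen_netbk *netbk)
        {
                /*
                 * Callers index xen_netbk[] exactly once and pass the
                 * pointer down, so the compiler catches anyone handing
                 * in a bare int where the struct is expected.
                 */
        }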
0002-Netback-Multiple-tasklets-support.txt:
Design question: you do your balancing by making each tasklet serve
roughly equal numbers of remote domains. Would it not have been
easier to make them serve equal numbers of netfronts? You could then
get rid of the domain_entry business completely and just have a count
of serviced netfronts in each xen_netbk, which might be a bit easier
to deal with.
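Something like this is what I have in mind (untested sketch;
netfront_count is a made-up field, not something in the current
patches):

        static int netbk_pick_group(struct xen_netbk *netbk, int group_nr)
        {
                int i, best = 0;

                /*
                 * Pick the group currently serving the fewest netfronts;
                 * locking omitted for brevity.
                 */
                for (i = 1; i < group_nr; i++)
                        if (netbk[i].netfront_count < netbk[best].netfront_count)
                                best = i;

                netbk[best].netfront_count++;
                return best;
        }

Tearing an interface down would then just decrement the count for its
group, with no list walk needed.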
> diff --git a/drivers/xen/netback/interface.c b/drivers/xen/netback/interface.c
> index 21c1f95..bdf3c1d 100644
> --- a/drivers/xen/netback/interface.c
> +++ b/drivers/xen/netback/interface.c
> @@ -54,6 +54,59 @@
> static unsigned long netbk_queue_length = 32;
> module_param_named(queue_length, netbk_queue_length, ulong, 0644);
>
>
> +static void remove_domain_from_list(struct xen_netbk *netbk,
> + struct xen_netif *netif)
> +{
> + struct domain_entry *dom_entry = NULL;
> + int group = netif->group;
> +
> + list_for_each_entry(dom_entry,
> + &netbk[group].group_domain_list, dom) {
> + if (dom_entry->domid == netif->domid)
> + break;
> + }
> + if (!dom_entry)
> + return;
Can this ever happen?  If so, you might have issues when several
netfronts all end up in the same frontend domain, e.g.:
-- netfront A arrives and is added to list
-- netfront B arrives and is added to list
-- netfront B is removed
-- netfront B is removed again. It's no longer in the list,
but A is, so A gets removed instead.
The end result being that netback thinks the group is idle, but it
actually has netfront A in it. I guess the worst that can happen is
that you don't balance across tasklets properly, but it'd still be
better avoided.
If it *can't* happen, there should probably be some kind of warning
when it does.
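For what it's worth, an explicit lookup helper (taking the group's
xen_netbk directly) would make the not-found case obvious and give you
somewhere to put that warning.  Untested, and find_domain_entry() is a
name I've just made up:

        static struct domain_entry *find_domain_entry(struct xen_netbk *netbk,
                                                      struct xen_netif *netif)
        {
                struct domain_entry *ent;

                list_for_each_entry(ent, &netbk->group_domain_list, dom)
                        if (ent->domid == netif->domid)
                                return ent;

                /* "can't happen": warn so we notice if it ever does */
                WARN(1, "netback: domid %d not in group %d list\n",
                     netif->domid, netif->group);
                return NULL;
        }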
> +
> + spin_lock(&netbk[netif->group].group_domain_list_lock);
> + netbk[netif->group].group_domain_nr--;
> + list_del(&dom_entry->dom);
> + spin_unlock(&netbk[netif->group].group_domain_list_lock);
> + kfree(dom_entry);
> +}
> +
> static void __netif_up(struct xen_netif *netif)
> {
> enable_irq(netif->irq);
>
> @@ -70,6 +123,7 @@ static int net_open(struct net_device *dev)
> {
> struct xen_netif *netif = netdev_priv(dev);
> if (netback_carrier_ok(netif)) {
> + add_domain_to_list(xen_netbk, group_nr, netif);
> __netif_up(netif);
> netif_start_queue(dev);
> }
> @@ -79,8 +133,10 @@ static int net_open(struct net_device *dev)
> static int net_close(struct net_device *dev)
> {
> struct xen_netif *netif = netdev_priv(dev);
> - if (netback_carrier_ok(netif))
> + if (netback_carrier_ok(netif)) {
> __netif_down(netif);
> + remove_domain_from_list(xen_netbk, netif);
> + }
> netif_stop_queue(dev);
> return 0;
> }
Okay, so if the interface gets down'd and then up'd it'll potentially
move to a different group. How much testing has that situation had?
I'd be tempted to add the interface to the list as soon as it's
created and then leave it there until it's removed, and not rebalance
during its lifetime at all, and hence avoid the issue.
> @@ -1570,6 +1570,7 @@ static int __init netback_init(void)
> /* We can increase reservation by this much in net_rx_action(). */
> // balloon_update_driver_allowance(NET_RX_RING_SIZE);
>
> + group_nr = num_online_cpus();
> xen_netbk = kzalloc(group_nr * sizeof(struct xen_netbk), GFP_KERNEL);
> if (!xen_netbk) {
> printk(KERN_ALERT "%s: out of memory\n", __func__);
What happens if the number of online CPUs changes while netback is
running? In particular, do we stop trying to send a tasklet/thread to
a CPU which has been offlined?
0003-Use-Kernel-thread-to-replace-the-tasklet.txt:
> diff --git a/drivers/xen/netback/netback.c b/drivers/xen/netback/netback.c
> index 773cd4f..8b55efc 100644
> --- a/drivers/xen/netback/netback.c
> +++ b/drivers/xen/netback/netback.c
> +static int netbk_action_thread(void *index)
> +{
> + int group = (long)index;
> + struct xen_netbk *netbk = &xen_netbk[group];
> + while (1) {
> + wait_event_interruptible(netbk->netbk_action_wq,
> + rx_work_todo(group)
> + || tx_work_todo(group));
> + cond_resched();
> +
> + if (rx_work_todo(group))
> + net_rx_action(group);
> +
> + if (tx_work_todo(group))
> + net_tx_action(group);
> + }
Hmm... You use kthread_stop() on this thread, so you should probably
test kthread_should_stop() every so often; otherwise kthread_stop()
will block forever waiting for the loop to exit.
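Something along these lines (untested), with the stop condition
included in the wait so that kthread_stop() can actually wake the
thread up:

        static int netbk_action_thread(void *index)
        {
                int group = (long)index;
                struct xen_netbk *netbk = &xen_netbk[group];

                while (!kthread_should_stop()) {
                        wait_event_interruptible(netbk->netbk_action_wq,
                                                 rx_work_todo(group) ||
                                                 tx_work_todo(group) ||
                                                 kthread_should_stop());
                        cond_resched();

                        if (rx_work_todo(group))
                                net_rx_action(group);

                        if (tx_work_todo(group))
                                net_tx_action(group);
                }

                return 0;
        }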
> +
> + return 0;
> +}
> +
> +
> static int __init netback_init(void)
> {
> int i;
Apart from that, it all looks fine to me.
Steven.