This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
Home Products Support Community News


Re: [Xen-devel][PATCH] netback: netif cleanup

>>> On 12/7/2009 at  3:04 AM, in message 
>>> <C7426771.3BA0%keir.fraser@xxxxxxxxxxxxx>,
Keir Fraser <keir.fraser@xxxxxxxxxxxxx> wrote: 
> On 06/12/2009 23:15, "Ky Srinivasan" <ksrinivasan@xxxxxxxxxx> wrote:
>>> This looks like a hack. I think you'd get most of the benefit by simply
>>> always calling net_tx_action_dealloc() at the top of net_tx_action() (i.e.,
>>> remove the dealloc_cons!=dealloc_prod conditional). I think the presence of
>>> that conditional is actually a bug, and removing it should get interface
>>> shutdowns to a reasonable time with no other changes.
>> That was my first impression as well and was the first patch we tested. It
>> does not work. Take the case where we have only one netif that netback is
>> handling that gets into this weird state, then we would never call
>> net_tx_action_dealloc() if the carrier is turned off on the netif that is
>> being cleaned up. This patch, fixes that problem by forcefully scheduling 
> the
>> netif whose carrier has been turned off.
> I think that this is simply another bug. There is an early 'return' in
> net_tx_action() which should actually goto the end of the function which
> checks the pending list and conditionally calls mod_timer(). Otherwise the
> pending list can get partially flushed but the timer doesn't get re-mod()ed.

I suspect you are referring to the conditional early return ( if 
(mop==tx_map_ops)) . Anna Fischer of HP has a test environment where we can see 
this problem most easily. I will give her a netback with these changes and 
report back on the cleanup times.


K. Y
>> In the interest of minimizing the
>> code changes, I went about making these changes one at a time, and the
>> combination of what I have in the patch appears to reduce the cleanup time 
> the
>> most. Even with these hacks, I am not particularly happy about the maximum
>> time we have seen the cleanup take, and for this duration the physical host 
> is
>> unusable for launching new guests. So, as I noted in my earlier email, we 
> may
>> still want to not tie up the xenbus thread for the duration of this cleanup.
> Perhaps we could decouple the whole thing from xenbus thread. In any case,
> the above bug fixes are to me preferable to the original patch, and will get
> the orders-of-magnitude low-hanging fruit (delay xenbus thread for maybe a
> second, rather than hours!).
>  -- Keir

Xen-devel mailing list

<Prev in Thread] Current Thread [Next in Thread>