WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

Re: [Xen-devel][PV-ops][PATCH] Netback: Fix PV network issue for netback

On Thu, 2010-07-01 at 16:29 +0100, Jeremy Fitzhardinge wrote:
> On 07/01/2010 04:48 PM, Ian Campbell wrote:
> > On Mon, 2010-06-21 at 12:14 +0100, Jeremy Fitzhardinge wrote:
> >   
> >> Subject: [PATCH] xen/netback: make sure all the group structures are 
> >> initialized before starting async code
> >>
> >> Split the netbk group array initialization into two phases: one to do
> >> simple "can't fail" initialization of lists, timers, locks, etc; and a
> >> second pass to allocate memory and start the async code.
> >>
> >> This makes sure that all the basic stuff is in place by the time the async 
> >> code
> >> is running.
> >>     
> > Paul noticed a crash in netback init because...
> >
> >   
> >> @@ -1764,16 +1766,6 @@ static int __init netback_init(void)
> >>            netbk->netbk_tx_pending_timer.function =
> >>                    netbk_tx_pending_timeout;
> >>  
> >> -          netbk->mmap_pages =
> >> -                  alloc_empty_pages_and_pagevec(MAX_PENDING_REQS);
> >> -          if (!netbk->mmap_pages) {
> >> -                  printk(KERN_ALERT "%s: out of memory\n", __func__);
> >> -                  del_timer(&netbk->netbk_tx_pending_timer);
> >> -                  del_timer(&netbk->net_timer);
> >> -                  rc = -ENOMEM;
> >> -                  goto failed_init;
> >> -          }
> >> -
> >>            for (i = 0; i < MAX_PENDING_REQS; i++) {
> >>                    page = netbk->mmap_pages[i];
> >>                    SetPageForeign(page, netif_page_release);
> >>     
> > ...this dereference of netbk->mmap_pages[i]...
> >
> >   
> >> @@ -1786,6 +1778,26 @@ static int __init netback_init(void)
> >>     
> > [...]
> >   
> >> +          netbk->mmap_pages =
> >> +                  alloc_empty_pages_and_pagevec(MAX_PENDING_REQS);
> >>     
> > ... happens before this initialisation of the array.
> >   
> 
> Hm, I hadn't meant to commit that properly.  I had it locally and
> accidentally pushed it out.
> 
> I only did that patch as an RFC in response to an issue alluded to by
> Dongxiao (or was it you?) about things not being fully initialized by
> the time the async code starts.  Is this a real issue, and if so, what's
> the correct fix?

I don't think there is an actual current issue, just a potential one
since we are relying on data structures being zeroed rather than
properly initialised to keep the async code from running off into the
weeds, it just seemed a little fragile this way.

Originally I said:
>> The crash is in one of the calls to list_move_tail and I think it is
>> because netbk->pending_inuse_head not being initialised until after
>> the 
>> threads and/or tasklets are created (I was running in threaded mode).
>> Perhaps even though we are now zeroing the netbk struct those fields
>> should still be initialised before kicking off any potentially
>> asynchronous tasks?

this specific issue was fixed by zeroing the netbk array as it is
allocated, I just thought we could make things more robust by not
triggering the async code until everything was fully setup.

Ian.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel