   
 
 
 

xen-devel

Re: [Xen-devel] netfront.c: gnttab_query_foreign_access returns nonzero

To: "Kirk Allan" <kallan@xxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: Re: [Xen-devel] netfront.c: gnttab_query_foreign_access returns nonzero in network_tx_buf_gc
From: "Steven Hand" <steven.hand@xxxxxxxxxxxx>
Date: Thu, 25 May 2006 17:37:58 +0100
References: <447574C5.39DB.0076.0@xxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx

I've been working from the netfront.c in the testing tree and using SLES
10 RC1 for i386 on an SMP box.  When I stress the network using iperf in
a domU, with the domU acting as client on a gigabit network, I
occasionally get a panic at the dev_kfree_skb_irq(skb); line.  This is
the same panic as reported in
http://lists.xensource.com/archives/html/xen-devel/2006-05/msg00919.html

The trace indicates that the skb is bad, and it looks like the skb is
an id.  Investigating further, the condition occurs if
gnttab_query_foreign_access returns nonzero on a second or later
iteration through the for loop.  If it returns nonzero, the code
takes the 'goto out', which bypasses fixing up np->tx.rsp_cons.  Then
the next time in network_tx_buf_gc we reuse np->tx.rsp_cons, which is at
the location of a previously completed skb, and the skb gets an id and
not an skb.
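
To make the failure mode described above concrete, here is a small
user-space model in C.  It is only a sketch under stated assumptions, not
the netfront code: struct model_np, RING_SIZE, the slot_is_skb and
grant_busy arrays, and the bad_frees counter are all hypothetical
stand-ins, and the slot id is taken as the ring index modulo the ring
size rather than read from a response.

```c
#include <assert.h>

#define RING_SIZE 8  /* hypothetical ring size for the model */

/* Hypothetical, simplified stand-in for the per-device netfront state. */
struct model_np {
    unsigned rsp_cons;               /* next response index to consume */
    unsigned rsp_prod;               /* responses produced by the backend */
    int      slot_is_skb[RING_SIZE]; /* 1: slot holds an skb, 0: slot holds a freelist id */
    int      grant_busy[RING_SIZE];  /* nonzero models gnttab_query_foreign_access() != 0 */
    unsigned bad_frees;              /* times a slot without an skb was "freed" */
};

/* Models the -testing behaviour: 'goto out' skips the rsp_cons update. */
static void tx_buf_gc_testing(struct model_np *np)
{
    unsigned i, prod = np->rsp_prod;

    assert(prod - np->rsp_cons <= RING_SIZE);

    for (i = np->rsp_cons; i != prod; i++) {
        unsigned id = i % RING_SIZE;

        if (np->grant_busy[id])
            goto out;                /* earlier slots were already released */

        if (!np->slot_is_skb[id])
            np->bad_frees++;         /* the real code would free an id as if it were an skb */

        np->slot_is_skb[id] = 0;     /* slot goes back on the freelist */
    }
    np->rsp_cons = prod;
out:
    return;
}

/* Three completed responses; the grant for the third is still busy. */
unsigned run_testing_scenario(void)
{
    struct model_np np = { 0 };
    unsigned i;

    np.rsp_prod = 3;
    for (i = 0; i < 3; i++)
        np.slot_is_skb[i] = 1;
    np.grant_busy[2] = 1;

    tx_buf_gc_testing(&np);          /* releases slots 0 and 1, bails at 2 */
    tx_buf_gc_testing(&np);          /* restarts at 0 and revisits slots 0 and 1 */

    return np.bad_frees;
}
```

In this model the second pass starts again at index 0 because rsp_cons
was never updated, so the two slots released on the first pass are
visited a second time while they hold freelist ids.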

Looking at the unstable tree, the goto has been removed and replaced
with a break.  However, it looks like if gnttab_query_foreign_access
returns nonzero between np->tx.rsp_cons and prod, then the
np->tx.rsp_cons = prod; could advance np->tx.rsp_cons too far, causing
other problems later (I have not tested this yet, though).
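
The concern about the 'break' variant can be illustrated the same way.
Again this is only a user-space sketch in C under the same assumptions
(struct model_np, RING_SIZE, slot_is_skb, grant_busy and stranded_slots
are hypothetical names, and the slot id is simply the ring index modulo
the ring size); it has not been checked against the -unstable source.

```c
#include <assert.h>

#define RING_SIZE 8  /* hypothetical ring size for the model */

/* Hypothetical, simplified stand-in for the per-device netfront state. */
struct model_np {
    unsigned rsp_cons;               /* next response index to consume */
    unsigned rsp_prod;               /* responses produced by the backend */
    int      slot_is_skb[RING_SIZE]; /* 1: slot still holds an skb */
    int      grant_busy[RING_SIZE];  /* nonzero models gnttab_query_foreign_access() != 0 */
};

/* Models the -unstable behaviour: 'break', then rsp_cons = prod anyway. */
static void tx_buf_gc_unstable(struct model_np *np)
{
    unsigned i, prod = np->rsp_prod;

    assert(prod - np->rsp_cons <= RING_SIZE);

    for (i = np->rsp_cons; i != prod; i++) {
        unsigned id = i % RING_SIZE;

        if (np->grant_busy[id])
            break;                   /* responses i..prod-1 are left unprocessed */

        np->slot_is_skb[id] = 0;     /* slot released normally */
    }
    np->rsp_cons = prod;             /* jumps past the unprocessed responses */
}

/*
 * Counts slots behind rsp_cons that still hold an skb.  Only valid in
 * this model while consumption started at index 0 and rsp_cons has not
 * wrapped past RING_SIZE.
 */
static unsigned stranded_slots(const struct model_np *np)
{
    unsigned i, n = 0;

    for (i = 0; i != np->rsp_cons; i++)
        if (np->slot_is_skb[i % RING_SIZE])
            n++;
    return n;
}

/* Four completed responses; the grant for the second is still busy. */
unsigned run_unstable_scenario(void)
{
    struct model_np np = { 0 };
    unsigned i;

    np.rsp_prod = 4;
    for (i = 0; i < 4; i++)
        np.slot_is_skb[i] = 1;
    np.grant_busy[1] = 1;

    tx_buf_gc_unstable(&np);         /* releases slot 0, breaks at slot 1 */

    return stranded_slots(&np);
}
```

Here slot 0 is released, the loop stops at slot 1, and rsp_cons still
moves to prod, so slots 1 to 3 are never revisited by a later pass.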

Yes, this definitely looks like a bug; the 'break' in -unstable is not really much better than the 'goto out' in -testing, since in either case we can't easily recover correctly.

The problem I'm having is that I can't find the root cause as to why
gnttab_query_foreign_access returns an 8 (GTF_reading?) and not 0.  I've
looked in netback.c and xen/common/grant_table.c and am not seeing
it (not that it's not there).

Well, all this means is that netback is still using the grant, which should of course be impossible since the ring pointers have been advanced. I.e. something is borked.

Can you try this with a debug build of xen? It would be interesting to see if xen
complains about any grant refs prior to this occurrence...


cheers,

S.

