xen-devel

Re: [Xen-devel] Problem with PV disk and iSCSI

from [Gary Grebus]

To:	Kurt Hackel <kurt.hackel@xxxxxxxxxx>
Subject:	Re: [Xen-devel] Problem with PV disk and iSCSI
From:	Gary Grebus <ggrebus@xxxxxxxxxxxxxxx>
Date:	Mon, 11 Feb 2008 10:13:13 -0500
Cc:	xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>

On Fri, 2008-02-08 at 22:15 -0800, Kurt Hackel wrote:
> Hi Gary,
> 
> On Fri, Feb 08, 2008 at 02:54:14PM -0500, Gary Grebus wrote:
> > I've run into a problem on 3.1.2 with an HVM guest using PV disks.  In
> > dom0, the physical disk is accessed using iSCSI.  The symptom is that
> > applications in dom0 which are monitoring the iSCSI network interface
> > (e.g. tcpdump) die with EFAULT errors.
> > 
...
> > 
> 
> We're seeing the same thing with 3.1.3.  When running iscsi in dom0
> (over a xen bridge) presenting these via blkfront to the guest we see 
> the same crash (below) while performing failover tests on the storage
> controller.
>
> Just as you said, the error occurs in skb_remove_foreign_references from
> loopback_start_xmit.  It's running all the foreign pages, attempting to
> copy each locally when it dies on the source address (esi) of the
> following memcpy:

That's a different failure from the one I see, but it looks like the same
underlying cause.  Our test used a dedicated iSCSI NIC, so netback
wasn't involved.  I haven't looked at how netback handles the mapped
pages.

> 
>                 vaddr = kmap_skb_frag(&skb_shinfo(skb)->frags[i]);
>                 off = skb_shinfo(skb)->frags[i].page_offset;
>                 memcpy(page_address(page) + off,
>                        vaddr + off,
>                        skb_shinfo(skb)->frags[i].size);
> 
> c053f2f7:       0f b7 74 c8 18          movzwl 0x18(%eax,%ecx,8),%esi
> c053f2fc:       0f b7 5c c8 1a          movzwl 0x1a(%eax,%ecx,8),%ebx
> c053f301:       8b 44 24 0c             mov    0xc(%esp),%eax
> c053f305:       e8 ba 09 f1 ff          call   0xc044fcc4  page_address
> c053f30a:       89 d9                   mov    %ebx,%ecx
> c053f30c:       c1 e9 02                shr    $0x2,%ecx
> c053f30f:       8d 3c 30                lea    (%eax,%esi,1),%edi
> c053f312:       03 74 24 04             add    0x4(%esp),%esi
> c053f316:       f3 a5                   rep movsl %ds:(%esi),%es:(%edi)   <<<<< memcpy
> ds: 007b esi: c0df7000 es: 007b edi: ebffb000
> 
> It seems one of the skb->frags has been unmapped.
> 
> 
> > I'm thinking blkback will have to make a dom0 copy of the page before
> > doing the unmap if there are still extra references?
> >
> 
> Can the unmap be deferred, handled by the last reference holder?  Or
> does this open up a potential security hole?
> 
When the initial block I/O completes, blkfront is going to remove the
grant, so I think you would have to defer notifying blkfront as well.
That doesn't seem workable, since the guest could see the I/O take an
extremely long time and trigger some timeout.  I think there has to be
a copy made at some point.

        /gary
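
The copy-before-unmap idea discussed above could be sketched roughly as
follows.  This is a userspace toy model only, not blkback code: struct
model_page, model_complete_io() and model_put_page() are hypothetical
names, the "grant mapping" is just a pointer into a caller-owned buffer,
and it assumes every holder reaches the data through the same structure,
which sidesteps the question of how real skb frags would be redirected
to the copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096

/* Hypothetical stand-in for a dom0 page that may be backed by a
 * grant-mapped guest page. Not a real Xen or Linux structure. */
struct model_page {
    unsigned char *data; /* current backing memory */
    int refcount;        /* blkback plus any other holders (e.g. an skb frag) */
    int foreign;         /* 1 while data points at granted guest memory */
    int owns_data;       /* 1 once data is a private copy we must free */
};

/* Model of I/O completion: blkback drops its reference and the grant
 * mapping goes away. If anyone else still holds the page, give them a
 * private copy first. Returns 0 on success, -1 if the copy cannot be
 * allocated. */
int model_complete_io(struct model_page *pg)
{
    if (pg->foreign && pg->refcount > 1) {
        unsigned char *copy = malloc(MODEL_PAGE_SIZE);
        if (copy == NULL)
            return -1;
        memcpy(copy, pg->data, MODEL_PAGE_SIZE);
        pg->data = copy;
        pg->owns_data = 1;
    } else if (pg->foreign) {
        pg->data = NULL; /* no other holder: nothing to preserve */
    }
    pg->foreign = 0; /* the "unmap": guest memory is no longer referenced */
    pg->refcount--;
    return 0;
}

/* A remaining holder drops its reference; the last one frees the copy. */
void model_put_page(struct model_page *pg)
{
    if (--pg->refcount == 0 && pg->owns_data) {
        free(pg->data);
        pg->data = NULL;
        pg->owns_data = 0;
    }
}
```

In this model the completion path never waits on the other holders, so
the guest would be notified promptly; the cost is one page copy whenever
an extra reference is still outstanding at completion time.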



