xen-devel
Re: [Xen-devel] can't create any more pv-on-hvm domains after ~38 under 3.
Keir Fraser wrote:
> On 03/12/2008 11:27, "James Harper" <james.harper@xxxxxxxxxxxxxxxx> wrote:
>
>> Alternatively it could be a combination of the gplpv drivers and netback
>> or blkback. I'm pretty sure that I had the problem before I started
>> testing pvscsi...
>>
>> The machine I am testing on will be busy for the rest of the night, but
>> tomorrow I'll do some testing and see what happens, unless you can
>> suggest a way I could discover what those pages belong to in the
>> meantime?
>
> Unfortunately it's a bit of a pain in the butt since we don't have full page
> tracking in Xen -- we only know that *someone* *somewhere* has that page
> mapped for *some* purpose. Indeed even with more tracking Xen can only
> really tell you which domain holds the reference, and that's bound to be
> dom0 (unless this is a bogus refcounting bug in Xen itself).
We have been investigating a similar-sounding bug (hung pages with elevated
reference counts) that occurs when blkback requests are issued over an iSCSI
backend device. The block requests appear to be running afoul of the lazy copy
optimization added for netback. In this path, foreign pages (assumed to be
netback pages?) are manipulated specially by the DMA layer of the dom0 network
stack. On return to netback, the page refs are cleaned up.

In our case, the foreign pages actually originate from blkback, are passed to
iSCSI for processing, and are abused by the ref manipulation in the dom0
network stack. On return to blkback, the page refs are off. What we haven't
been able to do yet is identify the exact circumstances that trigger the
issue. We have a fairly elaborate reproducer that involves running a pool of
domains and continuously rebooting them. Eventually, one domain will hang on
exit with a stuck page with elevated ref counts.

In our case, the stuck page is always a blkback I/O page.
Running the same test on an FC SAN or local SCSI backend device doesn't hang.
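
To make the suspected imbalance concrete, here is a toy model of the reference accounting described above. It is only a sketch, not the actual Xen or Linux code: the `Page` class and all function names are hypothetical, and it assumes the lazy copy path takes one extra reference that only netback's completion path knows to drop.

```python
from dataclasses import dataclass


@dataclass
class Page:
    owner: str          # hypothetical tag: which backend the foreign page came from
    refcount: int = 1   # the backend's own reference


def network_stack_dma(page: Page) -> None:
    # Assumed lazy copy behaviour: take an extra reference on any foreign
    # page, expecting netback to drop it on completion.
    page.refcount += 1


def netback_complete(page: Page) -> None:
    page.refcount -= 1  # undo the lazy copy reference
    page.refcount -= 1  # drop netback's own reference


def blkback_complete(page: Page) -> None:
    page.refcount -= 1  # drops only blkback's own reference


def run(owner: str) -> int:
    """Send one page through the network stack and return its final refcount."""
    page = Page(owner)
    network_stack_dma(page)
    if owner == "netback":
        netback_complete(page)
    else:
        blkback_complete(page)
    return page.refcount


if __name__ == "__main__":
    print("netback page final refcount:", run("netback"))
    print("blkback page final refcount:", run("blkback"))
```

In this model the blkback page ends with a leftover reference, which is the kind of stuck page we see at domain exit. Whether the real code path behaves this way is exactly what we have not yet pinned down.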
- Steve
> I would suggest dumping addresses of interesting control pages in your
> backend drivers (some can log that already if built with debugging I think),
> then match up the address of the remaining page in the zombie domain.
>
> -- Keir
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
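
The matching step Keir suggests above (log the addresses of interesting backend pages, then compare them with the page left in the zombie domain) could be sketched as follows. This is only an illustration: the labels, the frame numbers, and the idea that the backend log has already been parsed into a dictionary are all assumptions, not real Xen output.

```python
def match_stuck_page(stuck_mfn: int, logged_pages: dict[str, int]) -> list[str]:
    """Return the labels of logged pages whose frame number equals stuck_mfn."""
    return sorted(label for label, mfn in logged_pages.items() if mfn == stuck_mfn)


# Made-up example data: frame numbers a backend driver might have logged.
logged = {
    "blkback ring page": 0x1A2B3,
    "netback rx ring page": 0x1A2B4,
    "netback tx ring page": 0x1A2B5,
}

if __name__ == "__main__":
    # Frame number of the page still held in the zombie domain (hypothetical).
    print(match_stuck_page(0x1A2B3, logged))
```

An empty result would mean the stuck page is not one of the logged control pages, which points towards a data page instead.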