On Mon, 2010-03-08 at 18:56 -0500, Joanna Rutkowska wrote:
> On 03/09/2010 12:52 AM, Daniel Stodden wrote:
> > On Mon, 2010-03-08 at 18:30 -0500, Joanna Rutkowska wrote:
> >> On 03/09/2010 12:22 AM, Daniel Stodden wrote:
> >>> On Sun, 2010-03-07 at 11:12 -0500, Pasi Kärkkäinen wrote:
> >>>> On Sun, Mar 07, 2010 at 02:39:09PM +0000, Keir Fraser wrote:
> >>>>> On 07/03/2010 14:36, "Pasi Kärkkäinen" <pasik@xxxxxx> wrote:
> >>>>>
> >>>>>>> Tried a few times and no luck reproducing so far. I hope some other
> >>>>>>> people
> >>>>>>> on the list also will give it a go, since it's so easy to try it out.
> >>>>>>>
> >>>>>>
> >>>>>> I'm able to reproduce this with xen/master 2.6.31.6 dom0 kernel (from
> >>>>>> 2010-02-20),
> >>>>>> but I'm not able to reproduce it with the current xen/stable 2.6.32.9.
> >>>>>>
> >>>>>> I'll try with the most recent 2.6.31.6 dom0 kernel aswell..
> >>>>>
> >>>>> Thanks Pasi!
> >>>>>
> >>>>
> >>>> It seems to happen with the latest xen/master 2.6.31.6 aswell!
> >>>
> >>> Does this look to you like we're corrupting memory or on-disk storage?
> >>>
> >>> E.g. does a
> >>> $ dd if=/dev/zero bs=1M | hexdump -C
> >>> have the same issue?
> >>>
> >>
> >> I think there might be a chance that the above executes correctly, even
> >> if we have memory corruption -- this might be e.g. because the actual
> >> "dest" buffer here would be much smaller than the fs cache buffer used
> >> when we copy onto disk. And so our small dest buffer might just not be
> >> so likely to be hit with this presumably random corruption.
> >>
> >> Perhaps dd'ing onto /dev/shm would be a better way to check this?
> >
> > I agree that a negative doesn't mean much. I'm just poking around there
> > because the positive would have mattered: If we still get to see it,
> > we're out of the storage discussion and can focus on memory corruption.
> >
>
> If you're thinking about a potential Dom0 disk-driver problem, then I
> think we can rule this out. This is because I have tried this on both
> encrypted and non-encrypted filesystems, but the pattern of corruptions
> was exactly the same. If the disk driver was feeding LUKS (the crypto
> driver) with a wrong data, the corruptions would definitely look
> differently.

I'm not considering the device drivers, but rather everything on top of
them. Also, I hadn't understood that the issue is present in dom0 at the
time I wrote that.

Still, it'd help to figure out whether the corruption came in on the way
up from /dev/zero or on the way down to the disk. That dd crosses the
page cache, so the data still has a long way to go.
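
A rough sketch of how the two directions could be told apart, assuming
GNU dd and a tmpfs mounted at /dev/shm. The helper names, file paths and
sizes are made up for illustration; nobody in this thread has run this.
It writes zeros to a target and counts any bytes that read back non-zero:

```shell
#!/bin/sh
# Hypothetical helper: count the bytes in file $1 that are not 0x00.
# tr deletes every NUL byte, so wc only sees whatever was corrupted.
count_nonzero() {
    tr -d '\000' < "$1" | wc -c | tr -d ' '
}

# Hypothetical helper: write $2 MiB of zeros to file $1, then read the
# file back. Returns 0 if it is clean, 1 if non-zero bytes were found,
# and 2 if dd itself failed.
check_target() {
    target=$1
    mib=$2
    dd if=/dev/zero of="$target" bs=1M count="$mib" 2>/dev/null || return 2
    bad=$(count_nonzero "$target")
    echo "$target: $bad non-zero bytes"
    [ "$bad" -eq 0 ]
}

# Example invocations (paths and sizes are placeholders):
#   check_target /dev/shm/zeros.img 1024   # memory-backed tmpfs
#   check_target /var/tmp/zeros.img 1024   # disk-backed filesystem
```

If the tmpfs run already shows non-zero bytes, storage is out of the
picture. Note that reading back a freshly written disk-backed file will
most likely be served from the page cache, so a clean result there says
little about what actually reached the disk.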

For now I'm mostly just glad to hear it's not in the backends. Thanks for
that so far :o)

Daniel
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel