On Thu, 2010-08-12 at 14:36 -0400, Yuehai Xu wrote:
> On Thu, Aug 12, 2010 at 2:21 PM, Jeremy Fitzhardinge <jeremy@xxxxxxxx> wrote:
> > On 08/12/2010 11:18 AM, Yuehai Xu wrote:
> >>
> >> On Thu, Aug 12, 2010 at 2:16 PM, Yuehai Xu<yuehaixu@xxxxxxxxx> wrote:
> >>>
> >>> On Thu, Aug 12, 2010 at 2:04 PM, Jeremy Fitzhardinge<jeremy@xxxxxxxx>
> >>> wrote:
> >>>>
> >>>> On 08/11/2010 08:42 PM, Yuehai Xu wrote:
> >>>>>
> >>>>> However, the result turns out that my assumption is wrong. The number
> >>>>> of pending requests, according to the trace of blktrace, is changing
> >>>>> like this way: 9 8 7 6 5 4 3 2 1 1 1 2 3 4 5 4 3 2 1 1 1 2 3 4 5 6 7 8
> >>>>> 8 8..., just like a curve.
> >>>>>
> >>>>> I am puzzled about this weird result. Can anybody explain what has
> >>>>> happened between domU and dom0 for this result? Does this result make
> >>>>> sense? or I did something wrong to get this result.
> >>>>
> >>>> If you're using a journalled filesystem in the guest, it will need to
> >>>> drain the IO queue periodically to control the write ordering. You
> >>>> should
> >>>> also observe barrier writes in the blkfront stream.
> >>>>
> >>>> J
> >>>>
> >>> The file system I use in the guest system is ext3, which is a
> >>> journaled file system. However, I don't quite understand what you said
> >>> ".. control the write ordering" because the 10 processes running in
> >>> the guest system all just send requests, there is no write request.
> >>> What do you mean by "barrier writes" here?
> >>>
> >>> Thanks,
> >>> Yuehai
> >>>
> >> I am sorry for the missing word, the requests sent by the 10 processes
> >> in the guest system are all read requests.
> >
> > Even a pure read-only workload may generate writes for metadata unless
> > you've turned it off. Is it a read-only mount? Do you have the noatime
> > mount option? Is the device itself read-only?
> >
>
> The definition of my disk is: ['tap2:aio:/PATH/dom.img, hda1, w'], so,
> I think it should not be read-only mount, and I don't set any specific
> option for mount. The device itself should be read-write.
>
>
> > Still, it seems odd that it won't/can't keep the queue full of read
> > requests. Unless it's getting local cache hits?
> >
> > J
> >
>
> I don't think the local cache would be hit because every time I did
> the test, I drop the cache both in the guest and host OS. And, the
> access pattern is stride read, it is impossible to hit the cache.
>
> I am not sure whether there are write requests; even if there are, I
> think the number of write requests should be very small, will it
> affect the I/O queue of guest or host? I don't think so. The common
> sense should be that the I/O queue in the host system should be almost
> full because tapdisk2 is async.
Most of what is coming to my mind has already been mentioned above.
Maybe try a read-only mount to avoid metadata updates.
What do you mean by stride read? Just reads with some fixed stride? What
stride size? Did you make sure to turn off OS readahead (iirc 128k)?
What's the underlying storage type? If it's a file, was the data fully
preallocated?
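To make the stride and readahead questions concrete, here is a minimal
sketch of a fixed-stride reader. The name stride_read and its parameters
are hypothetical, not taken from your test program. It assumes a Linux
guest, where the POSIX_FADV_RANDOM hint asks the kernel not to read ahead
on this one file descriptor; it does not change the readahead setting of
the block device itself.

```python
import os


def stride_read(path, block_size, stride, count):
    """Read `count` blocks of `block_size` bytes, one every `stride` bytes.

    Returns the total number of bytes actually read. Hypothetical helper
    for illustration only.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # Hint random access so the kernel skips readahead on this fd.
        # Guarded because posix_fadvise is not available on every platform.
        if hasattr(os, "posix_fadvise"):
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_RANDOM)
        total = 0
        for i in range(count):
            data = os.pread(fd, block_size, i * stride)
            if not data:
                break  # reached end of file
            total += len(data)
        return total
    finally:
        os.close(fd)
```

If the stride is smaller than the readahead window and the hint is left
out, many of the reads could be served from the guest page cache and
never reach blkfront.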
If the request offsets qualify for a merge, then blktap will do so quite
aggressively, so you will see a lot of the I/O complete in discrete
batches rather than incrementally, request by request.
How did you sample the pending number of requests?
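For comparison, a minimal sketch of one way to derive the pending count
from a trace. It assumes the events have already been parsed into
(timestamp, action) pairs, where 'D' means a request was issued to the
driver and 'C' means a request completed, as in blkparse output. The
function name pending_depth and the tuple layout are hypothetical.

```python
def pending_depth(events):
    """Return (timestamp, in-flight count) after each issue or completion.

    events: iterable of (timestamp, action) tuples. Only 'D' (issued)
    and 'C' (completed) change the count; other actions are ignored.
    """
    depth = 0
    samples = []
    for ts, action in sorted(events, key=lambda e: e[0]):
        if action == "D":
            depth += 1
        elif action == "C":
            # The trace may start with requests already in flight,
            # so never let the count go negative.
            depth = max(depth - 1, 0)
        else:
            continue
        samples.append((ts, depth))
    return samples
```

Note that if requests were merged before being issued, one 'D'/'C' pair
covers several of the original requests, so a count like this would
understate what the guest submitted.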
Cheers,
Daniel
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel