Re: [Xen-devel] domU is causing misaligned disk writes

To: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
From: Tracy Reed <treed@xxxxxxxxxxxxxxx>
Date: Tue, 20 Apr 2010 14:19:13 -0700
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx, Aoetools-discuss@xxxxxxxxxxxxxxxxxxxxx
In-reply-to: <20100420202519.GB9220@xxxxxxxxxxxxxxxxxxx>
References: <20100420080958.GN5660@xxxxxxxxxxxxx> <20100420084955.GV1878@xxxxxxxxxxx> <20100420200004.GQ5660@xxxxxxxxxxxxx> <20100420202519.GB9220@xxxxxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.20 (2009-06-14)
On Tue, Apr 20, 2010 at 04:25:19PM -0400, Konrad Rzeszutek Wilk spake thusly:
> The DomU disk from the Dom0 perspective is using 'phy' which means
> there is no caching in Dom0 for that disk (but it is in DomU).
That is fine. I don't particularly want caching in dom0.
> Caching should be done in DomU in that case - which begs the question -
> how much memory do you have in your DomU? What happens if you
> give to both Dom0 and DomU the same amount of memory?
4G in domU and 1G in dom0.
> OK. That is possibly caused by the fact that you are caching the data.
> Look at your buffers cache (and drop the cache before this) and see
> how it grows.
I try to use large amounts of data so that cache is less of a factor, but I
also drop the cache before each test using:
echo 1 > /proc/sys/vm/drop_caches
I had to start doing this not only to ensure accurate results but also
because the way reads were being cached was really confusing: I would see a
test start out apparently fine, writing at good speed according to iostat,
and then suddenly start hitting the disk with reads when it ran into data
that had not already been read into cache.
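(Strictly speaking, echo 1 only drops the page cache. To be thorough one can
sync first and drop dentries and inodes as well, roughly:

sync
echo 3 > /proc/sys/vm/drop_caches

For these large sequential runs the page cache is the part that matters, so 1
has been good enough for me.)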
> How do you know this is a mis-aligned sectors issue? Is this what your
> AOE vendor is telling you ?
No AoE vendor involved. I am using the free stuff. I think it is a
misalignment issue because during a write-only test it is doing massive
amounts of reading according to iostat.
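For what it is worth, I am watching for those reads with extended iostat
stats, something like:

iostat -x 1

and looking at the r/s and rsec/s columns on the backing disks, which I would
expect to stay near zero during a pure write.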
Also note that there are several different kinds of misalignment which
can occur:
- Disk sector misalignment
- RAID chunk size misalignment
- Page cache misalignment
Would the first two necessarily show up in iostat? I'm not sure whether
disk sector misalignment is dealt with automatically in the hardware or
whether the kernel aligns it for us. RAID chunk size misalignment seems like
it would be dealt with in the RAID card if using hardware RAID, but I am not,
so the software RAID implementation might cause reads to show up in iostat.
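In case it helps, these are the sorts of checks I can run on the target to
rule the first two out (device names below are just placeholders, and
blockdev --getalignoff needs a fairly recent util-linux):

fdisk -lu /dev/sdX                 (partition start sectors, in 512-byte units)
mdadm --detail /dev/md0            (reports the chunk size of the md array)
blockdev --getalignoff /dev/sdX1   (non-zero means the partition start is not aligned)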
The Linux page cache works in 4k pages, which is why I am using a 4k block
size in my dd tests.
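(getconf PAGESIZE prints 4096 here, so bs=4096 lines up with exactly one page.)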
> I was thinking of first eliminating caching from the picture and seeing
> the speeds you get when you do direct IO to the spindles. You can do this
> using
> a tool called 'fio' or 'dd' with the oflag=direct. Try doing that from
> both Dom0 and DomU and see what the speeds are.
I have never been quite clear on the purpose of oflag=direct. I have read in
the dd man page that it is supposed to bypass the cache, but whenever I use
it performance is horrible, far worse than merely not caching would explain.
I am doing the above dd with oflag=direct now as you suggested and I see
around 30 seconds of nothing hitting the disks and then two or three seconds
of writing in iostat on the target. I just ctrl-c'd the dd and it shows:
# dd if=/dev/zero of=/dev/etherd/e6.1 oflag=direct bs=4096 count=3000000
1764883+0 records in
1764883+0 records out
7228960768 bytes (7.2 GB) copied, 402.852 seconds, 17.9 MB/s
But even on my local directly attached SATA workstation disk when
doing that same dd on an otherwise idle machine I see performance
like:
$ dd if=/dev/zero of=foo.test bs=4096 count=4000000
^C755202+0 records in
755202+0 records out
3093307392 bytes (3.1 GB) copied, 128.552 s, 24.1 MB/s
which again suggests that oflag=direct isn't doing quite what I expect.
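One thing I still need to try, so take this as a guess rather than a result:
direct I/O with a much larger block size. With oflag=direct dd waits for each
block to complete before issuing the next, and 4k at a time over AoE means
paying a full network round trip per request, which could explain the numbers
above without any misalignment. Something along the lines of:

dd if=/dev/zero of=/dev/etherd/e6.1 oflag=direct bs=1M count=4000

or a fio run with some queue depth, for example:

fio --name=writetest --filename=/dev/etherd/e6.1 --rw=write --bs=4k --direct=1 --ioengine=libaio --iodepth=32 --size=4g

should tell the two cases apart: if the large-block or queued numbers look
sane, the slow runs above were just synchronous 4k requests; if they are
still terrible and iostat still shows lots of reads, misalignment is back on
the table.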
--
Tracy Reed
http://tracyreed.org