This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] [PATCH] blkback: Fix block I/O latency issue

To: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH] blkback: Fix block I/O latency issue
From: "Vincent, Pradeep" <pradeepv@xxxxxxxxxx>
Date: Thu, 12 May 2011 17:40:40 -0700
Accept-language: en-US
Acceptlanguage: en-US
Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxxxx>, Daniel Stodden <daniel.stodden@xxxxxxxxxx>
Delivery-date: Thu, 12 May 2011 17:41:44 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20110509202403.GA27755@xxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcwRBmLgfenCykILT8SGaOOqoTN1pw==
Thread-topic: [Xen-devel] [PATCH] blkback: Fix block I/O latency issue
User-agent: Microsoft-MacOutlook/
Thanks Konrad.

>>I presume you have tested this in the production

Yes. Absolutely. 

>>what were the numbers when it came to high bandwidth

Under a high I/O workload, where blkfront fills up the queue as fast as
blkback drains it, the I/O latency problem in question doesn't manifest
itself, and as a result this patch doesn't make much of a difference in
terms of interrupt rate. My benchmarks didn't show any significant
effect.

The above rationale, combined with relatively high disk I/O latencies
(compared to IRQ latency), generally keeps the interrupt rate from
becoming excessive. Also, blkfront's interrupt generation mechanism works
exactly the same way as the patched blkback's, and netfront and netback
generate interrupts the same way as well.

Under a 'moderate' I/O workload, the interrupt rate does go up, but the
I/O latency benefit clearly outweighs the cost of the extra interrupts
(which isn't much for a moderate I/O load anyway).

Overall, the advantages of this patch (I/O latency improvement) outweigh
any potential fringe negative effects by a large margin, and the fact that
netfront, netback and blkfront already share the same interrupt generation
design should give us a lot of confidence.

That said, I do think a comprehensive interrupt throttling mechanism for
netback, blkback and other backend I/O drivers would be useful and should
be pursued as a separate initiative. Such a mechanism would be
particularly useful for the netfront-netback stack, which is more
susceptible to interrupt storms than blkfront-blkback. An 'IRQ coalescing'
type mechanism that could induce delays on the order of tens of microsecs
(certainly not millisecs) to minimize the interrupt generation rate would
be useful (similar to what NICs do).


- Pradeep Vincent

On 5/9/11 1:24 PM, "Konrad Rzeszutek Wilk" <konrad.wilk@xxxxxxxxxx> wrote:

>On Tue, May 03, 2011 at 06:54:38PM -0700, Vincent, Pradeep wrote:
>> Hey Daniel,
>> Thanks for your comments.
>> >> The notification avoidance these macros implement does not promote
>> >> deliberate latency. This stuff is not dropping events or deferring
>> >> requests.
>> >> It only avoids a gratuitous notification sent by the remote end in
>> >> cases where the local one didn't go to sleep yet, and therefore can
>> >> guarantee that it's going to process the message ASAP, right after
>> >> finishing what's still pending from the previous kick.
>> If the design goal was to simply avoid unnecessary interrupts but not
>> delay I/Os, then the blkback code has a bug.
>> If the design goal was to delay I/Os in order to reduce the interrupt
>> rate, then I am arguing that the design introduces way too much latency
>> and affects many applications.
>> Either way, this issue needs to be addressed.
>I agree we need to fix this. What I am curious is:
> - what are the workloads under which this patch has a negative effect.
> - I presume you have tested this in production - what were the numbers
>   when it came to high bandwidth (so imagine, four or six threads
>   putting as much I/O as possible)? Did the level of IRQs go way up
>   compared to not running with this patch?
>I am wondering if it might be worth looking at something NAPI-type in the
>block layer (so polling, basically). The concern I have is that this
>patch would trigger an interrupt storm for small-sized requests which
>might be happening at a high rate (say, 512-byte random writes).
>But perhaps the way for this to work is to have rate-limiting code in it
>so that there is no chance of interrupt storms.
