
Re: [Xen-devel] [PATCH] blkback: Fix block I/O latency issue

Thanks Konrad.

>>I presume you have tested this in the production

Yes. Absolutely. 

>>what were the numbers when it came to high bandwidth

Under high I/O workload, where blkfront keeps refilling the queue while
blkback is still working through it, the I/O latency problem in question
doesn't manifest itself, and as a result this patch doesn't make much of a
difference in terms of interrupt rate. My benchmarks didn't show any
significant effect.

The above rationale, combined with disk I/O latencies that are high
relative to IRQ latency, generally keeps the interrupt rate from becoming
excessive. Also, blkfront's interrupt generation mechanism works exactly
the same way as the patched blkback's. Netfront and netback generate
interrupts the same way as well.
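
For anyone less familiar with the ring protocol, the event-index check
behind this style of interrupt generation can be modelled roughly as
below. This is a simplified, single-direction sketch and not the code
from ring.h or from the patch: demo_ring, demo_push_and_check_notify and
demo_final_check are hypothetical names, and the memory barriers a real
shared ring needs are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of one direction of a shared ring. */
struct demo_ring {
    uint32_t prod;   /* index up to which the producer has published */
    uint32_t event;  /* consumer's request: "notify me once prod reaches this" */
};

/*
 * Producer side: publish everything up to new_prod and report whether the
 * consumer needs a notification. A notification is needed only if the
 * consumer's event index falls in (old_prod, new_prod]. Unsigned
 * subtraction keeps the comparison correct across index wraparound.
 * (A real ring needs memory barriers around the prod update.)
 */
static bool demo_push_and_check_notify(struct demo_ring *r, uint32_t new_prod)
{
    uint32_t old_prod = r->prod;

    r->prod = new_prod;
    return (uint32_t)(new_prod - r->event) < (uint32_t)(new_prod - old_prod);
}

/*
 * Consumer side: called after processing everything up to cons. If nothing
 * is pending, re-arm the event index and then check once more, so that an
 * item published in between is not left waiting for a notification that
 * will never be sent. Returns true if there is still work to do.
 */
static bool demo_final_check(struct demo_ring *r, uint32_t cons)
{
    if (r->prod != cons)
        return true;
    r->event = cons + 1;
    return r->prod != cons;
}
```

While the consumer is still busy and has not re-armed the event index,
further pushes return false and no interrupt is sent, which is why the
interrupt rate stays low under heavy load.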

Under 'moderate' I/O workload, the interrupt rate does go up, but the
I/O latency benefit clearly outweighs the cost of the extra interrupts
(which isn't much for moderate I/O load anyway).

Overall, the advantages of this patch (I/O latency improvement) outweigh
any potential fringe negative effects by a large margin, and the fact that
netfront, netback and blkfront already use the same interrupt generation
design should give us a lot of confidence.

That said, I do think a comprehensive interrupt throttling mechanism for
netback, blkback and other backend I/O drivers would be useful and should
be pursued as a separate initiative. Such a mechanism would be
particularly useful for the netfront-netback stack, which is more
susceptible to interrupt storms than blkfront-blkback. An 'IRQ coalescing'
type mechanism that could induce delays on the order of tens of
microseconds (certainly not milliseconds) to reduce the interrupt
generation rate would be useful (similar to what NICs do).
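
To make the idea concrete, here is a minimal sketch of what such a
coalescer could look like. It is only an illustration under assumptions,
not a proposal for actual blkback or netback code: demo_coalescer and both
functions are hypothetical names, time is passed in explicitly as
nanoseconds, and the caller is assumed to run demo_coalesce_timer from a
timer so that pending events never wait longer than max_delay_ns.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical coalescing state for one backend-to-frontend channel. */
struct demo_coalescer {
    uint64_t max_delay_ns;      /* upper bound on the latency added to an event */
    uint32_t max_pending;       /* send immediately once this many events wait */
    uint64_t first_pending_ns;  /* arrival time of the oldest unsent event */
    uint32_t pending;           /* events accumulated since the last interrupt */
};

/*
 * Record one completed event at time now_ns. Returns true if an interrupt
 * should be sent now, either because enough events have accumulated or
 * because the oldest pending event has already waited max_delay_ns.
 */
static bool demo_coalesce_event(struct demo_coalescer *c, uint64_t now_ns)
{
    if (c->pending == 0)
        c->first_pending_ns = now_ns;
    c->pending++;

    if (c->pending >= c->max_pending ||
        now_ns - c->first_pending_ns >= c->max_delay_ns) {
        c->pending = 0;
        return true;
    }
    return false;
}

/*
 * Timer path: flush pending events whose delay budget has expired, so a
 * lone event is delayed by at most max_delay_ns plus timer granularity.
 */
static bool demo_coalesce_timer(struct demo_coalescer *c, uint64_t now_ns)
{
    if (c->pending > 0 &&
        now_ns - c->first_pending_ns >= c->max_delay_ns) {
        c->pending = 0;
        return true;
    }
    return false;
}
```

With max_delay_ns set to something like 20000 (20 microseconds), a lone
completion is delayed by a bounded, small amount, while a burst of
completions shares a single interrupt.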


- Pradeep Vincent

On 5/9/11 1:24 PM, "Konrad Rzeszutek Wilk" <konrad.wilk@xxxxxxxxxx> wrote:

>On Tue, May 03, 2011 at 06:54:38PM -0700, Vincent, Pradeep wrote:
>> Hey Daniel,
>> Thanks for your comments.
>> >> The notification avoidance these macros implement does not promote
>> >> deliberate latency. This stuff is not dropping events or deferring
>> >> requests.
>> >> It only avoids a gratuitous notification sent by the remote end in
>> >> cases where the local one didn't go to sleep yet, and therefore can
>> >> guarantee that it's going to process the message ASAP, right after
>> >> finishing what's still pending from the previous kick.
>> If the design goal was to simply avoid unnecessary interrupts but not
>> delay I/Os, then blkback code has a bug.
>> If the design goal was to delay the I/Os in order to reduce the interrupt
>> rate, then I am arguing that the design introduces way too much latency
>> that affects many applications.
>> Either way, this issue needs to be addressed.
>I agree we need to fix this. What I am curious about is:
> - what are the workloads under which this patch has a negative effect.
> - I presume you have tested this in the production - what were the
>   numbers when it came to high bandwidth (so imagine, four or six threads
>   putting as much I/O as possible)? Did the level of IRQs go way up
>   compared to not running with this patch?
>I am wondering if it might be worth looking into something NAPI-type in
>the block layer (so polling basically). The concern I have is that this
>patch would trigger an interrupt storm for small sized requests which
>might be happening at a high rate (say, 512 bytes random writes).
>But perhaps the way for this to work is to have rate-limiting code in it
>so that there is no chance of interrupt storms.
