xen-devel

Re: [Xen-devel] [PATCH] xen/blkback: Don't let in-flight requests defer pending ones.

To: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH] xen/blkback: Don't let in-flight requests defer pending ones.
From: Daniel Stodden <daniel.stodden@xxxxxxxxxx>
Date: Tue, 31 May 2011 09:30:02 -0700
Cc: "pradeepv@xxxxxxxxxx" <pradeepv@xxxxxxxxxx>, Xen <xen-devel@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <20110531160807.GA31639@xxxxxxxxxxxx>
References: <1306613567.20284.83.camel@ramone> <1306614070-13137-1-git-send-email-daniel.stodden@xxxxxxxxxx> <20110531160807.GA31639@xxxxxxxxxxxx>
On Tue, 2011-05-31 at 12:08 -0400, Konrad Rzeszutek Wilk wrote:
> On Sat, May 28, 2011 at 01:21:10PM -0700, Daniel Stodden wrote:
> > Running RING_FINAL_CHECK_FOR_REQUESTS from make_response is a bad
> > idea. It means that in-flight I/O essentially blocks continued
> > batches. This kills throughput on frontends which unplug (or even
> > just notify) early and rightfully assume additional requests
> 
> You have any perf numbers?

This was on iSCSI: a single stripe, not a big array. The case of
interest was a simple linear dd of 1 GB. Dom0 gets to around 70 MB/s on
the raw device node.

Guest before:

dd if=/dev/zero of=/dev/xvdc bs=1M count=1024
1073741824 bytes (1.1 GB) copied, 24.5 s, 43.8 MB/s
1073741824 bytes (1.1 GB) copied, 23.9216 s, 44.9 MB/s
1073741824 bytes (1.1 GB) copied, 23.336 s, 46.0 MB/s

Guest after:

dd if=/dev/zero of=/dev/xvdc bs=1M count=1024
1073741824 bytes (1.1 GB) copied, 17.0105 s, 63.1 MB/s
1073741824 bytes (1.1 GB) copied, 17.7566 s, 60.5 MB/s
1073741824 bytes (1.1 GB) copied, 17.163 s, 62.6 MB/s
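
That is roughly 40% more throughput (about 45 MB/s to about 62 MB/s on
average), closing much of the gap to the ~70 MB/s dom0 baseline.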

This is not a silver bullet. With faster backends, there are more issues
with queue depth and request assembly, but one main blocker is what
Pradeep first described.
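
For reference, the final check is what re-arms frontend notifications.
Abridged from the RING_FINAL_CHECK_FOR_REQUESTS macro in the ring.h
interface header (the inline comments are mine):

#define RING_FINAL_CHECK_FOR_REQUESTS(_r, _work_to_do) do {	\
	(_work_to_do) = RING_HAS_UNCONSUMED_REQUESTS(_r);	\
	if (_work_to_do)					\
		break;						\
	/* Ring looks empty: re-arm frontend notifications. */	\
	(_r)->sring->req_event = (_r)->req_cons + 1;		\
	mb();	/* order req_event write vs. the re-check */	\
	(_work_to_do) = RING_HAS_UNCONSUMED_REQUESTS(_r);	\
} while (0)

Running this only from make_response means a request posted after the
last consumer pass waits for some in-flight bio to complete before
anyone looks at the ring again; looping on it in do_block_io_op picks
such requests up immediately.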

Daniel

> > will be picked up on time, not synchronously.
> > 
> > Signed-off-by: Daniel Stodden <daniel.stodden@xxxxxxxxxx>
> > ---
> >  drivers/block/xen-blkback/blkback.c |   36 ++++++++++++++++++----------------
> >  1 files changed, 19 insertions(+), 17 deletions(-)
> > 
> > diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
> > index 9dee545..48ad7fa 100644
> > --- a/drivers/block/xen-blkback/blkback.c
> > +++ b/drivers/block/xen-blkback/blkback.c
> > @@ -451,7 +451,8 @@ static void end_block_io_op(struct bio *bio, int error)
> >   * (which has the sectors we want, number of them, grant references, etc),
> >   * and transmute  it to the block API to hand it over to the proper block disk.
> >   */
> > -static int do_block_io_op(struct xen_blkif *blkif)
> > +static int
> > +__do_block_io_op(struct xen_blkif *blkif)
> >  {
> >     union blkif_back_rings *blk_rings = &blkif->blk_rings;
> >     struct blkif_request req;
> > @@ -508,6 +509,23 @@ static int do_block_io_op(struct xen_blkif *blkif)
> >     return more_to_do;
> >  }
> >  
> > +static int
> > +do_block_io_op(struct xen_blkif *blkif)
> > +{
> > +   union blkif_back_rings *blk_rings = &blkif->blk_rings;
> > +   int more_to_do;
> > +
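> > +   /*
> > +    * Drain the ring, then do a final check that re-arms frontend
> > +    * notifications before looking once more. This closes the race
> > +    * with a request posted after the last pass but before the
> > +    * event was re-armed.
> > +    */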
> > +   do {
> > +           more_to_do = __do_block_io_op(blkif);
> > +           if (more_to_do)
> > +                   break;
> > +
> > +           RING_FINAL_CHECK_FOR_REQUESTS(&blk_rings->common, more_to_do);
> > +   } while (more_to_do);
> > +
> > +   return more_to_do;
> > +}
> > +
> >  /*
> >   * Transmutation of the 'struct blkif_request' to a proper 'struct bio'
> >   * and call the 'submit_bio' to pass it to the underlying storage.
> > @@ -698,7 +716,6 @@ static void make_response(struct xen_blkif *blkif, u64 id,
> >     struct blkif_response  resp;
> >     unsigned long     flags;
> >     union blkif_back_rings *blk_rings = &blkif->blk_rings;
> > -   int more_to_do = 0;
> >     int notify;
> >  
> >     resp.id        = id;
> > @@ -725,22 +742,7 @@ static void make_response(struct xen_blkif *blkif, u64 id,
> >     }
> >     blk_rings->common.rsp_prod_pvt++;
> >     RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&blk_rings->common, notify);
> > -   if (blk_rings->common.rsp_prod_pvt == blk_rings->common.req_cons) {
> > -           /*
> > -            * Tail check for pending requests. Allows frontend to avoid
> > -            * notifications if requests are already in flight (lower
> > -            * overheads and promotes batching).
> > -            */
> > -           RING_FINAL_CHECK_FOR_REQUESTS(&blk_rings->common, more_to_do);
> > -
> > -   } else if (RING_HAS_UNCONSUMED_REQUESTS(&blk_rings->common)) {
> > -           more_to_do = 1;
> > -   }
> > -
> >     spin_unlock_irqrestore(&blkif->blk_ring_lock, flags);
> > -
> > -   if (more_to_do)
> > -           blkif_notify_work(blkif);
> >     if (notify)
> >             notify_remote_via_irq(blkif->irq);
> >  }
> > -- 
> > 1.7.4.1
> > 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel