WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-devel

Re: [Xen-devel] [PATCH] add trim command to blkback interface

To: Owen Smith <owen.smith@xxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH] add trim command to blkback interface
From: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Date: Mon, 06 Dec 2010 12:26:22 -0800
Cc: Xen Devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Mon, 06 Dec 2010 12:27:17 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <DECC4DEE86A8074C9ABE8DFD330CB6E38D07A16D3D@xxxxxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <DECC4DEE86A8074C9ABE8DFD330CB6E38D07A16D3D@xxxxxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.12) Gecko/20101103 Fedora/1.0-0.33.b2pre.fc14 Lightning/1.0b3pre Thunderbird/3.1.6
On 12/06/2010 03:28 AM, Owen Smith wrote:
>
> [PATCH] add trim command to blkback interface
>
> This patch adds the trim operation to the blkback ring protocol.
>
> The ring protocol is extended by unioning the current read/write/write
> barrier specific fields with fields specific for trim, and allowing
> further expansion with additional union members, providing the
> structure size and alignment do not change.
>
> Blkback will respond to trim operations with a BLKIF_RSP_EOPNOTSUPP,
> to avoid using the default response of writing an error log entry and
> returning BLKIF_RSP_ERROR.
>
> Trim commands are passed with sector_number as the sector index to
> begin trim operations at, and nr_sectors as the number of sectors to
> be trimmed. These sectors should be trimmed if the underlying block
> device supports trimming. More information about trim commands:
>
> http://t13.org/Documents/UploadedDocuments/docs2008/e07154r6-Data_Set_Management_Proposal_for_ATA-ACS2.doc
>

Aside from the comments in my reply to Jan's mail, I can't actually get
a patch that applies out of this; it's all mashed up with
quoted-printable.  Could you repost as a plain-text attachment if you
can't convince your mailer to send a plain-text email?

Thanks,
    J

>
> This patch is intended to fix an interface, not supply the
> implementation of trim in the backend drivers.
>
> Signed-off-by: Owen Smith <owen.smith@xxxxxxxxxx>
>
> diff --git a/drivers/xen/blkback/blkback.c b/drivers/xen/blkback/blkback.c
> index 0bef445..bd09512 100644
> --- a/drivers/xen/blkback/blkback.c
> +++ b/drivers/xen/blkback/blkback.c
> @@ -367,6 +367,11 @@ static int do_block_io_op(blkif_t *blkif)
>  			blkif->st_wr_req++;
>  			dispatch_rw_block_io(blkif, &req, pending_req);
>  			break;
> +		case BLKIF_OP_TRIM:
> +			make_response(blkif, req.id, req.operation,
> +				      BLKIF_RSP_EOPNOTSUPP);
> +			free_req(pending_req);
> +			break;
>  		default:
>  			/* A good sign something is wrong: sleep for a while to
>  			 * avoid excessive CPU consumption by a bad guest. */
> @@ -424,7 +429,7 @@ static void dispatch_rw_block_io(blkif_t *blkif,
>  	}
>  	preq.dev           = req->handle;
> -	preq.sector_number = req->sector_number;
> +	preq.sector_number = req->rw.sector_number;
>  	preq.nr_sects      = 0;
>  	pending_req->blkif     = blkif;
> @@ -436,11 +441,11 @@ static void dispatch_rw_block_io(blkif_t *blkif,
>  	for (i = 0; i < nseg; i++) {
>  		uint32_t flags;
> -		seg[i].nsec = req->seg[i].last_sect -
> -			req->seg[i].first_sect + 1;
> +		seg[i].nsec = req->rw.seg[i].last_sect -
> +			req->rw.seg[i].first_sect + 1;
> -		if ((req->seg[i].last_sect >= (PAGE_SIZE >> 9)) ||
> -		    (req->seg[i].last_sect < req->seg[i].first_sect))
> +		if ((req->rw.seg[i].last_sect >= (PAGE_SIZE >> 9)) ||
> +		    (req->rw.seg[i].last_sect < req->rw.seg[i].first_sect))
>  			goto fail_response;
>  		preq.nr_sects += seg[i].nsec;
> @@ -448,7 +453,7 @@ static void dispatch_rw_block_io(blkif_t *blkif,
>  		if (operation != READ)
>  			flags |= GNTMAP_readonly;
>  		gnttab_set_map_op(&map[i], vaddr(pending_req, i), flags,
> -				  req->seg[i].gref, blkif->domid);
> +				  req->rw.seg[i].gref, blkif->domid);
>  	}
>  	ret = HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, map, nseg);
> @@ -466,11 +471,11 @@ static void dispatch_rw_block_io(blkif_t *blkif,
>  			page_to_pfn(pending_page(pending_req, i)),
>  			FOREIGN_FRAME(map[i].dev_bus_addr >> PAGE_SHIFT));
>  		seg[i].buf  = map[i].dev_bus_addr |
> -			(req->seg[i].first_sect << 9);
> +			(req->rw.seg[i].first_sect << 9);
>  		blkback_pagemap_set(vaddr_pagenr(pending_req, i),
>  				    pending_page(pending_req, i),
>  				    blkif->domid, req->handle,
> -				    req->seg[i].gref);
> +				    req->rw.seg[i].gref);
>  		pending_handle(pending_req, i) = map[i].handle;
>  	}
>
> diff --git a/include/xen/blkif.h b/include/xen/blkif.h
> index 7172081..e727e5d 100644
> --- a/include/xen/blkif.h
> +++ b/include/xen/blkif.h
> @@ -44,8 +44,18 @@ struct blkif_x86_32_request {
>  	uint8_t        nr_segments;  /* number of segments                   */
>  	blkif_vdev_t   handle;       /* only for read/write requests         */
>  	uint64_t       id;           /* private guest value, echoed in resp  */
> -	blkif_sector_t sector_number;/* start sector idx on disk (r/w only)  */
> -	struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
> +
> +	union {
> +		struct blkif_x86_32_request_rw {
> +			blkif_sector_t sector_number;/* start sector idx on disk (r/w only)  */
> +			struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
> +		} rw;
> +
> +		struct blkif_x86_32_request_trim {
> +			blkif_sector_t sector_number;
> +			uint64_t       nr_sectors;
> +		} trim;
> +	};
>  };
>  struct blkif_x86_32_response {
>  	uint64_t        id;              /* copied from request */
> @@ -62,8 +72,18 @@ struct blkif_x86_64_request {
>  	uint8_t        nr_segments;  /* number of segments                   */
>  	blkif_vdev_t   handle;       /* only for read/write requests         */
>  	uint64_t       __attribute__((__aligned__(8))) id;
> -	blkif_sector_t sector_number;/* start sector idx on disk (r/w only)  */
> -	struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
> +
> +	union {
> +		struct blkif_x86_64_request_rw {
> +			blkif_sector_t sector_number;/* start sector idx on disk (r/w only)  */
> +			struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
> +		} rw;
> +
> +		struct blkif_x86_64_request_trim {
> +			blkif_sector_t sector_number;
> +			uint64_t       nr_sectors;
> +		} trim;
> +	};
>  };
>  struct blkif_x86_64_response {
>  	uint64_t       __attribute__((__aligned__(8))) id;
> @@ -97,12 +117,24 @@ static void inline blkif_get_x86_32_req(struct blkif_request *dst, struct blkif_
>  	dst->nr_segments = src->nr_segments;
>  	dst->handle = src->handle;
>  	dst->id = src->id;
> -	dst->sector_number = src->sector_number;
> -	barrier();
> -	if (n > dst->nr_segments)
> -		n = dst->nr_segments;
> -	for (i = 0; i < n; i++)
> -		dst->seg[i] = src->seg[i];
> +	switch (src->operation) {
> +	case BLKIF_OP_READ:
> +	case BLKIF_OP_WRITE:
> +	case BLKIF_OP_WRITE_BARRIER:
> +		dst->rw.sector_number = src->rw.sector_number;
> +		barrier();
> +		if (n > dst->nr_segments)
> +			n = dst->nr_segments;
> +		for (i = 0; i < n; i++)
> +			dst->rw.seg[i] = src->rw.seg[i];
> +		break;
> +	case BLKIF_OP_TRIM:
> +		dst->trim.sector_number = src->trim.sector_number;
> +		dst->trim.nr_sectors = src->trim.nr_sectors;
> +		break;
> +	default:
> +		break;
> +	}
>  }
>  static void inline blkif_get_x86_64_req(struct blkif_request *dst, struct blkif_x86_64_request *src)
> @@ -112,12 +144,24 @@ static void inline blkif_get_x86_64_req(struct blkif_request *dst, struct blkif_
>  	dst->nr_segments = src->nr_segments;
>  	dst->handle = src->handle;
>  	dst->id = src->id;
> -	dst->sector_number = src->sector_number;
> -	barrier();
> -	if (n > dst->nr_segments)
> -		n = dst->nr_segments;
> -	for (i = 0; i < n; i++)
> -		dst->seg[i] = src->seg[i];
> +	switch (src->operation) {
> +	case BLKIF_OP_READ:
> +	case BLKIF_OP_WRITE:
> +	case BLKIF_OP_WRITE_BARRIER:
> +		dst->rw.sector_number = src->rw.sector_number;
> +		barrier();
> +		if (n > dst->nr_segments)
> +			n = dst->nr_segments;
> +		for (i = 0; i < n; i++)
> +			dst->rw.seg[i] = src->rw.seg[i];
> +		break;
> +	case BLKIF_OP_TRIM:
> +		dst->trim.sector_number = src->trim.sector_number;
> +		dst->trim.nr_sectors = src->trim.nr_sectors;
> +		break;
> +	default:
> +		break;
> +	}
>  }
>  #endif /* __XEN_BLKIF_H__ */
>
> diff --git a/include/xen/interface/io/blkif.h b/include/xen/interface/io/blkif.h
> index 68dd2b4..54bb598 100644
> --- a/include/xen/interface/io/blkif.h
> +++ b/include/xen/interface/io/blkif.h
> @@ -43,6 +43,17 @@ typedef uint64_t blkif_sector_t;
>   * create the "feature-barrier" node!
>   */
>  #define BLKIF_OP_WRITE_BARRIER     2
> +/*
> + * Recognised only if "feature-trim" is present in backend xenbus info.
> + * The "feature-trim" node contains a boolean indicating whether trim
> + * requests are likely to succeed or fail. Either way, a trim request
> + * may fail at any time with BLKIF_RSP_EOPNOTSUPP if it is unsupported by
> + * the underlying block-device hardware. The boolean simply indicates whether
> + * or not it is worthwhile for the frontend to attempt trim requests.
> + * If a backend does not recognise BLKIF_OP_TRIM, it should *not*
> + * create the "feature-trim" node!
> + */
> +#define BLKIF_OP_TRIM              4
>  /*
>   * Maximum scatter/gather segments per request.
> @@ -56,13 +67,23 @@ struct blkif_request {
>  	uint8_t        nr_segments;  /* number of segments                   */
>  	blkif_vdev_t   handle;       /* only for read/write requests         */
>  	uint64_t       id;           /* private guest value, echoed in resp  */
> -	blkif_sector_t sector_number;/* start sector idx on disk (r/w only)  */
> -	struct blkif_request_segment {
> -		grant_ref_t gref;        /* reference to I/O buffer frame        */
> -		/* @first_sect: first sector in frame to transfer (inclusive).   */
> -		/* @last_sect: last sector in frame to transfer (inclusive).     */
> -		uint8_t     first_sect, last_sect;
> -	} seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
> +
> +	union {
> +		struct blkif_request_rw {
> +			blkif_sector_t sector_number;/* start sector idx on disk (r/w only)  */
> +			struct blkif_request_segment {
> +				grant_ref_t gref;        /* reference to I/O buffer frame        */
> +				/* @first_sect: first sector in frame to transfer (inclusive).   */
> +				/* @last_sect: last sector in frame to transfer (inclusive).     */
> +				uint8_t     first_sect, last_sect;
> +			} seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
> +		} rw;
> +
> +		struct blkif_request_trim {
> +			blkif_sector_t sector_number;
> +			uint64_t       nr_sectors;
> +		} trim;
> +	};
>  };
>  struct blkif_response {
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel