[Xen-devel] Re: [kvm-devel] [PATCH RFC 1/3] virtio infrastructure

To: Rusty Russell <rusty@xxxxxxxxxxxxxxx>
Subject: [Xen-devel] Re: [kvm-devel] [PATCH RFC 1/3] virtio infrastructure
From: Avi Kivity <avi@xxxxxxxxxxxx>
Date: Sun, 03 Jun 2007 08:28:50 +0300
Cc: Jimi Xenidis <jimix@xxxxxxxxxxxxxx>, Stephen Rothwell <sfr@xxxxxxxxxxxxxxxx>, Xen Mailing List <xen-devel@xxxxxxxxxxxxxxxxxxx>, "jmk@xxxxxxxxxxxxxxxxxxx" <jmk@xxxxxxxxxxxxxxxxxxx>, Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>, kvm-devel <kvm-devel@xxxxxxxxxxxxxxxxxxxxx>, virtualization <virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx>, Christian Borntraeger <cborntra@xxxxxxxxxx>, Suzanne McIntosh <skranjac@xxxxxxxxxx>, Martin Schwidefsky <schwidefsky@xxxxxxxxxx>
Delivery-date: Sat, 02 Jun 2007 22:27:23 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <1180777836.9228.44.camel@xxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <1180613947.11133.58.camel@xxxxxxxxxxxxxxxxxxxxx> <46610E8D.10706@xxxxxxxxxxxx> <1180777836.9228.44.camel@xxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Thunderbird 2.0.0.0 (X11/20070419)
Rusty Russell wrote:
> On Sat, 2007-06-02 at 09:30 +0300, Avi Kivity wrote:
>   
>> Rusty Russell wrote:
>>     
>>> + * virtio_ops - virtio abstraction layer
>>> + * @add_outbuf: prepare to send data to the other end:
>>> + * vdev: the virtio_device
>>> + * sg: the description of the buffer(s).
>>> + * num: the size of the sg array.
>>> + * used: the length sent (set once sending is done).
>>> + *      Returns an identifier or an error.
>>> + * @add_inbuf: prepare to receive data from the other end:
>>> + * vdev: the virtio_device
>>> + * sg: the description of the buffer(s).
>>> + * num: the size of the sg array.
>>> + * used: the length received (set once data is received).
>>> + *      Returns an identifier or an error (e.g. -ENOSPC).
>>>   
>>>       
>> Instead of 'used', how about a completion callback (with associated data
>> pointer)?  A new helper, virtio_complete(), would call the callback for
>> all completed requests.  It would eliminate all the tedious scanning
>> used to match the identifier.
>>     
>
> Hi Avi,
>
>       There were several considerations here.  My first was that the drivers
> look much more like normal devices this way than they would if they got a
> callback for every buffer.  Secondly, "used" batches much more nicely than
> a completion callback.  Finally, it's also something you really want to
> know, so the driver doesn't have to zero its inbufs (an untrusted other
> side says it sent you 1500 bytes but actually sent nothing, and now you
> spray kernel memory out the NIC).
>   

Sure, 'used' is important (how else do you get the packet size on
receive?); I'm just bitching about the linear scan.
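
For concreteness, the callback-based interface being floated here might look
roughly like the sketch below; every name in it (virtio_complete_fn, the
token argument, the exact signatures) is illustrative rather than taken from
the posted patch.

/* Sketch only -- a completion-callback variant of the buffer interface.
 * Stand-in types keep the example self-contained; real code would use
 * the kernel's struct scatterlist and struct virtio_device. */
struct scatterlist { void *buf; unsigned int len; };
struct virtio_device { void *priv; };

/* Called once per finished buffer; 'len' is how much was actually used. */
typedef void (*virtio_complete_fn)(struct virtio_device *vdev,
                                   void *token, unsigned long len);

struct virtio_ops {
        /* Queue buffer(s) to send; 'cb' fires once the other end is done. */
        unsigned long (*add_outbuf)(struct virtio_device *vdev,
                                    const struct scatterlist sg[],
                                    unsigned int num,
                                    void *token, virtio_complete_fn cb);

        /* Queue buffer(s) to receive; 'cb' fires with the received length. */
        unsigned long (*add_inbuf)(struct virtio_device *vdev,
                                   const struct scatterlist sg[],
                                   unsigned int num,
                                   void *token, virtio_complete_fn cb);
};

/* Helper: run the callback for every request completed so far, so the
 * driver never scans its own tables to match returned identifiers. */
void virtio_complete(struct virtio_device *vdev);

The interrupt handler would then just call virtio_complete() and let the
callbacks hand back the tokens, instead of walking an array of outstanding
buffers.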

>       I also considered some scheme like:
>
>       struct virtio_used_info
>       {
>               unsigned long len;
>               void *next_token;
>       };
>       ...
>       unsigned long (*add_outbuf)(struct virtio_device *vdev,
>                                   const struct scatterlist sg[],
>                                   unsigned int num,
>                                   void *token,
>                                   struct virtio_used_info *used_info);
>
> So the "used" becomes a "used/next" pair and you can just walk the
> linked list.  But I wasn't convinced that walking the buffers is going
> to be a performance issue (though the net driver puts them in a
> contiguous array for cache friendliness, as a nod to this concern).
>   

Well, if you have 256 slots per direction, you will scan 4KB of memory
per interrupt (256 slots x 2 directions x 8 bytes).  You may need a
queue length greater than 256 for a high bandwidth interface, if you
want to reduce your interrupt rate, or if your guest is scheduled out
for lengthy periods (say, a millisecond).
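
To make the trade-off concrete, the two completion paths compare roughly as
in the sketch below; everything beyond struct virtio_used_info, and all of
the function and field names, is invented for illustration.

/* Per-buffer completion record, per the "used/next" idea quoted above. */
struct virtio_used_info {
        unsigned long len;        /* set once the buffer has been consumed */
        void *next_token;         /* token of the next completed buffer    */
};

/* Illustrative driver-side bookkeeping for one slot. */
struct slot {
        struct virtio_used_info used;
        int in_flight;
};

/* Linear scan: every slot is touched on every interrupt, completed or
 * not -- this is the multi-KB-per-interrupt scan objected to above. */
static unsigned int reap_by_scan(struct slot *slots, unsigned int nslots)
{
        unsigned int i, done = 0;

        for (i = 0; i < nslots; i++) {
                if (slots[i].in_flight && slots[i].used.len) {
                        slots[i].in_flight = 0;
                        done++;
                }
        }
        return done;
}

/* used/next chain: only the buffers that actually completed are touched
 * (here the token is assumed to be the slot pointer itself). */
static unsigned int reap_by_chain(struct slot *first_done)
{
        struct slot *s = first_done;
        unsigned int done = 0;

        while (s) {
                s->in_flight = 0;
                done++;
                s = (struct slot *)s->used.next_token;
        }
        return done;
}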


>   
>> It would also be nice to support a bit of non-buffer data, like a set of
>> bitflags.
>>     
>
>       I expect this might be necessary, but it hasn't been so far.  The
> non-buffer data tends to go in sg[0]: the block driver works this way, and
> the network driver will for GSO.  Of course, a specialized virtio_ops
> backend might well take this and put the info somewhere else.
>   

Yeah.  Well, it's a somewhat roundabout way of doing things.
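
As a rough illustration of that sg[0] convention, the sketch below shows the
out-of-band data riding as the first scatterlist entry with the payload
behind it; the header layout and all names are invented for the example
rather than taken from either driver.

/* Stand-in scatterlist so the sketch is self-contained; real drivers
 * use the kernel's struct scatterlist. */
struct scatterlist { void *buf; unsigned int len; };

/* Hypothetical per-packet metadata (e.g. GSO details for the net driver). */
struct xmit_hdr {
        unsigned char  flags;
        unsigned char  gso_type;
        unsigned short gso_size;
};

/* The "non-buffer" data is just sg[0]; the payload follows in sg[1],
 * so add_outbuf needs no extra metadata argument at all. */
static void fill_xmit_sg(struct scatterlist sg[2], struct xmit_hdr *hdr,
                         void *packet, unsigned int packet_len)
{
        sg[0].buf = hdr;              /* out-of-band info first  */
        sg[0].len = sizeof(*hdr);
        sg[1].buf = packet;           /* then the actual payload */
        sg[1].len = packet_len;
}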


-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel