Re: [Xen-devel] [Patch 0/7] pvSCSI driver

Hi Steven-san,

On Thu, 13 Mar 2008 14:30:10 +0000
Steven Smith <steven.smith@xxxxxxxxxxxxx> wrote:
> Backtracking a little, the fundamental goal here is to make some
> logical units which are accessible to dom0 appear inside the guest.
> Guest operating systems are unlikely to be very happy about having
> logical units floating around not attached to scsi hosts, and so we
> need (somehow) to come up with a scsi host which has the right set of
> logical units attached to it.  There are lots of valid use cases in
> which there don't exist physical hosts with the right set of LUs, and
> so somebody needs to invent one, and then emulate it.  That somebody
> will necessarily be either the frontend or the backend.
> 
> Doing the emulation also gives you the option of filtering out things
> like TCQ support in INQUIRY commands, which might be supported by the
> physical device but certainly isn't supported by the pvSCSI protocol.
> 
> If you emulate the HBA in the backend, you get a design like this:
> 
> -- There is usually only one xenbus scsi device attached to any given
>    VM, and that device represents the emulated HBA.
> 
> -- scsifront creates a struct scsi_host (or equivalent) for each
>    xenbus device, and those provide your interface to the rest of the
>    guest operating system.
> 
> -- When the guest OS submits a request to the frontend driver, it gets
>    packaged up and shipped over the ring to the backend pretty much
>    completely unchanged.
> 
> -- The backend figures out what the request is doing, and either:
> 
>    a) Routes it to a physical device, or
>    b) Synthesises an answer (for things like REPORT LUNS), or
>    c) Fails the request (for things like WRITE BUFFER),
> 
>    as appropriate.
> 
> If you emulate the HBA in the frontend, you get a design which looks
> like this:
> 
> -- Each logical unit exposed to the guest has its own xenbus scsi
>    device.
> 
> -- scsifront creates a single struct scsi_host, representing the
>    emulated HBA.
> 
> -- When the guest OS submits a request to the frontend driver, it
>    either:
> 
>    a) Routes it to a Xen scsifront and passes it off to the backend, or
>    b) Synthesises an answer, or
>    c) Fails the request,
> 
>    as appropriate.
> 
> -- When a request reaches the backend, it does a basic check to make
>    sure that it's dealing with one of the whitelisted requests, and
>    then sends it directly to the relevant physical device.  The
>    routing problem is trivial here, because there is only ever one
>    physical device (struct scsi_device in Linux-speak) associated with
>    any xenbus device, and the request is just dropped directly into
>    the relevant request queue.
> 
> The first approach gives you a simple frontend at the expense of a
> complicated backend, while the second one gives you a simple backend
> at the expense of a complicated frontend.  It seems likely that there
> will be more frontend implementations than backend, which suggests
> that putting the HBA emulation in the backend is a better choice.

I agree with your thoughts. On the other hand, "more frontend
implementations" could also mean that each guest OS has its own
emulation policy, in which case emulating in the frontend might be
more suitable. It is very difficult to decide which approach to take;
each has both good points and bad points. :-<

However, I would like to take the first approach, emulation in the
backend, following your and James Smart-san's advice, and start the
implementation. :-)
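
A minimal sketch of the three-way decision the backend makes in the
first approach (route, synthesise, or fail). This is only an
illustration: the names pvscsi_classify and pvscsi_action are
hypothetical and do not come from the patch series, and the set of
routed commands is an assumption. The opcode values are the standard
SCSI operation codes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard SCSI operation codes (first byte of the CDB). */
#define OP_TEST_UNIT_READY 0x00
#define OP_INQUIRY         0x12
#define OP_READ_10         0x28
#define OP_WRITE_10        0x2A
#define OP_WRITE_BUFFER    0x3B
#define OP_REPORT_LUNS     0xA0

/* Hypothetical result type: what the backend does with a request. */
enum pvscsi_action {
    ACT_ROUTE,      /* a) pass on to the physical device */
    ACT_SYNTHESISE, /* b) answer from the emulated HBA   */
    ACT_FAIL        /* c) reject the request             */
};

/* Classify a request by its opcode. Anything not listed is failed. */
enum pvscsi_action pvscsi_classify(const uint8_t *cdb, size_t len)
{
    assert(cdb != NULL);

    if (len == 0)
        return ACT_FAIL;

    switch (cdb[0]) {
    case OP_REPORT_LUNS:
        /* The LUN list must reflect the emulated host. */
        return ACT_SYNTHESISE;
    case OP_TEST_UNIT_READY:
    case OP_INQUIRY: /* a real backend may also filter the reply */
    case OP_READ_10:
    case OP_WRITE_10:
        return ACT_ROUTE;
    case OP_WRITE_BUFFER: /* could be used to change microcode */
    default:
        return ACT_FAIL;
    }
}
```

The default case fails the request, so a command nobody has looked at
yet never reaches the physical device.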


> The main difference from a performance point of view is that the
> second approach will use a ring for each device, whereas the first has
> a single ring shared across all devices, so you'll get more requests
> in flight with the second scheme.  I'd expect that just making the
> rings larger would have more effect, though, and that's easier when
> there's just one of them.
> 

I expect Netchannel2 to solve the performance issues.


> Looking through the SCSI spec, I don't think we're going to be able to
> get away with passing requests through from the frontend all the way
> to the physical disk without sanity checking the actual CDB in the
> backend.  There are a couple of commands which look scary:
> 
> -- CHANGE ALIAS/REPORT ALIAS -- the alias list is shared across
>    everything in the I_T nexus.  That will lead to interesting issues
>    if you ever have multiple guests modifying it at the same time.
> 
> -- EXTENDED COPY -- allows you to copy arbitrary data between logical
>    units, sometimes even ones not in the same target device.  That's
>    obviously going to need to be controlled in a VM setting.
> 
> -- Some mode pages, as modified by MODE SELECT, can apply across
>    multiple LUs.  Even more exciting, the level of sharing can in
>    principle vary between devices, even for the same page.
> 
> -- WRITE BUFFER commands can be used to change the microcode on a
>    device.  I've no idea what the implications of letting an untrusted
>    user push microcode into a device would be, but I doubt it's a good
>    idea.
> 
> -- I'm not sure whether we want to allow untrusted guests to issue SET
>    PRIORITY commands.
> 
> -- We've already been over REPORT LUNS :)
> 
> Plus whatever weird things the various device manufacturers decide to
> introduce.
> 
> What this means is that the REPORT LUNS issue fundamentally isn't
> restricted to just the REPORT LUNS command, but instead affects an
> unknown and potentially large set of other commands.  The only way I
> can see to deal with this is to white-list commands individually once
> they've been confirmed to be safe, and have the backend block any
> commands which haven't been checked yet.  That's going to be a fair
> amount of work, and it'll screw up the whole ``transparent pass
> through'' thing, but I can't see any other way of solving this problem
> safely.

I will start with a white-list that contains only the mandatory SCSI
commands, and then expand it to cover other commands.
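
A rough sketch of a white-list that can grow as commands are confirmed
to be safe, using one bit per opcode. The names (cdb_whitelist, wl_*)
are hypothetical, and the initial set chosen here (TEST UNIT READY,
REQUEST SENSE, INQUIRY) is an assumption about what "mandatory" would
cover, not something taken from the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One bit per possible opcode (the first CDB byte has 256 values). */
struct cdb_whitelist {
    uint8_t bits[256 / 8];
};

/* Start with everything blocked. */
void wl_init(struct cdb_whitelist *wl)
{
    assert(wl != NULL);
    memset(wl->bits, 0, sizeof(wl->bits));
}

/* Mark one opcode as checked and safe to pass through. */
void wl_allow(struct cdb_whitelist *wl, uint8_t opcode)
{
    assert(wl != NULL);
    wl->bits[opcode / 8] |= (uint8_t)(1u << (opcode % 8));
}

/* True only if the request's opcode has been white-listed. */
bool wl_permits(const struct cdb_whitelist *wl,
                const uint8_t *cdb, size_t len)
{
    assert(wl != NULL);
    if (cdb == NULL || len == 0)
        return false;
    return ((wl->bits[cdb[0] / 8] >> (cdb[0] % 8)) & 1u) != 0;
}

/* Assumed starting set; extend with wl_allow() as commands are vetted. */
void wl_init_mandatory(struct cdb_whitelist *wl)
{
    wl_init(wl);
    wl_allow(wl, 0x00); /* TEST UNIT READY */
    wl_allow(wl, 0x03); /* REQUEST SENSE   */
    wl_allow(wl, 0x12); /* INQUIRY         */
}
```

Checking the opcode alone would not be enough for commands such as
MODE SELECT, where safety depends on the page being modified; those
would need a further check on the CDB contents or the data.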


> (And even that assumes that the hardware people got everything right.
> Most devices will be designed on the assumption that only trusted
> system components can submit CDBs, so it wouldn't surprise me if some
> of them can be made to do bad things if a malicious CDB comes in.
> There's not really a great deal we can do about this, though.)


Best regards,

-----
Jun Kamada



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel