Re: [Xen-devel] HELP required with some ideas

On Mon, Aug 30, 2010 at 02:15:26AM +0530, grapgroup grapgroup wrote:
> On Sun, Aug 29, 2010 at 7:35 PM, Pasi Kärkkäinen <pasik@xxxxxx> wrote:
> 
> > On Sun, Aug 29, 2010 at 06:17:06PM +0530, grapgroup grapgroup wrote:
> > >    Hi,
> > >     We are a group of four students studying in an undergraduate college.
> > >     We are new to XEN and we would like to contribute to the development
> > of
> > >    XEN through our college final year project.
> > >     We have gone through a few research papers and have shortlisted a few
> > >    ideas out of which we are going to finalize the project.
> > >     As we are beginners we would be very grateful if you could guide us
> > in
> > >    any of the following ways :
> > >
> >
> > Hello!
> >
> > Could you send the links to the papers you mention?
> > Some comments below..
> >
> > >    1)  telling us if the idea is already implemented in
> > >    XEN.                                                       OR
> > >    2)  if the idea is implemented then suggesting any modifications which
> > can
> > >    be done in it.       OR
> > >    3)  telling the feasibility of the idea.
> > >
> > >    We would be very thankful if you could guide us in any way.
> > >    We would also like to think on any ideas suggested by you.
> > >
> > >    Regards,
> > >      Rohan Malpani
> > >      Ammar Ekbote
> > >      Paresh Nakhe
> > >      Gaurav Jain
> > >
> > >    *******************************IDEAS*****************************
> > >    1) Disk I/O scheduling on virtual machines
> > >
> > >        Scheduling algorithms for native OS are designed keeping in mind
> > the
> > >    latency characteristics of the disk. In virtual environment, a
> > >    VM will have a virtual disk which is physical space on the physical
> > disk.
> > >    Therefore, the same algorithms do not work well on virtual
> > >    machines. There is a need for new scheduling algorithms for VMs which will
> > >    take into account the type of workload and perform scheduling in
> > >    such a way as to increase performance. The paper we referred to
> > >    suggested using two-level scheduling, one at the VM level and the other at
> > >    the hypervisor level.
> > >
> >
> > Have you guys looked at projects like dm-ioband ?
> >
> >
> > >    2) Network Interface Virtualization
> > >
> > >        There is a particular mechanism in XEN called 'Page grant
> > mechanism'
> > >    to achieve network interface virtualization. In this
> > >    mechanism there is considerable s/w overhead as for each I/O, access
> > to
> > >    certain guest pages(I/O buffer) is granted to driver domain and is
> > >    immediately revoked as soon as the i/o is complete. The current mechanism
> > >    is said to achieve 2.9 Gb/s on a 10 Gb/s line. The paper
> > >    we referred to suggested a mechanism by which this s/w overhead can be
> > >    reduced to a great extent.
> > >    The first is an implementation of multi-queue NIC support for the driver
> > >    domain model in Xen, and the other is a grant reuse mechanism based on
> > >    a software I/O address translation table. In this, once access to guest
> > >    pages is granted, it is reused for multiple i/o transactions.
> > >
> >
> > Some of this stuff is done in the xen 'netchannel2' development.
> >
> > I think there are multiple presentations about possible xen network
> > improvements available from XenSummit slides.
> >
> > >    3) Asymmetry aware hypervisor
> > >
> > >        Experiments show that asymmetric multi-core processors are more
> > >    efficient than the SMP. Idea is to deliver better performance
> > >    per watt and per area. The paper suggests that each VM running on the
> > >    hypervisor has some number of fast vCPUs and some number of slow
> > >    vCPUs. Each task is identified for its type and accordingly sent to
> > fast
> > >    or slow vCPU. CPU intensive applications are scheduled on fast
> > >    vCPUs and memory intensive applications are scheduled on slow vCPUs.
> > These
> > >    vCPUs are mapped to the corresponding type of physical
> > >    core. The hypervisor needs to be modified to become asymmetry aware. The
> > >    goals of such a hypervisor are:
> > >
> > >    1.fair sharing of fast cores among all vCPUs in the system;
> > >    2.support for "asymmetry aware" guests;
> > >    3.a mechanism for controlling priority of VMs in using fast cores;
> > >    4.a mechanism ensuring that fast cores never go idle before slow cores
> >
> > Hmm.. do you mean NUMA aware hypervisor/VMs, or something else?
> >
> > -- Pasi
> >
> >
> Hello,
>  First of all we would like to thank you for sparing your time and looking
> at the suggested ideas.
> 
>   We have mentioned below the links for the papers regarding the ideas.
>   We have also gone through the topics which you mentioned and we have
> summarized below what we found.
>   We have elaborated on two of the ideas a bit further to give a clearer
> picture of them.
> 
>    Our concerns regarding all of these ideas are whether they are
> feasible as a 7-8 month project and whether they are already implemented
> elsewhere.
>    Any suggestions extending or modifying these ideas would prove to be of
> great help.
> 
> Links to papers:
> 
> 1)On Disk I/O Scheduling in Virtual Machines :
> http://sysrun.haifa.il.ibm.com/hrl/wiov2010/papers/kesavan.pdf
> 2) Network Interface Virtualization :
> http://www.cs.rice.edu/CS/Architecture/docs/ram-vee09.pdf
> 3) Asymmetric aware hypervisors :
> http://www.cs.sfu.ca/~fedorova/papers/vee04-kazempour.pdf
> 
> Regards,
>     Rohan Malpani
>     Ammar Ekbote
>     Paresh Nakhe
>     Gaurav Jain
> 
> 
> ***************************************************************************************************************************************************************************
> 
> *1) dm-ioband * (in context of the first idea : On Disk I/O Scheduling in
> Virtual Machines)
>      dm-ioband is an I/O bandwidth controller implemented as a device-mapper
> driver and can control bandwidth on a per-partition, per-user, per-process, or
> per-virtual-machine (such as KVM or Xen) basis. Our suggested idea does not
> revolve around I/O scheduling between VMs; it is about how disk I/O
> scheduling is carried out at different levels in virtualized environments.
> 
> 
> *Further elaboration of the first idea*  (On Disk I/O Scheduling in Virtual
> Machines)
> 
>      The suggested idea intends to introduce a disk I/O scheduling algorithm
> in the hypervisor which would take into consideration the disk I/O
> scheduling in the guest VM.
> 
> 
>  The scenario is as follows :
> 
> To read or write, the disk head must be positioned at the desired track and
> at the beginning of the desired sector, and in doing so we incur seek
> time and rotational delay.
> For a single disk there will be a number of outstanding I/O requests.
> If requests are selected randomly, we will get poor performance.

That is not entirely true. Think of SSDs, where random writes are not a
problem anymore. Also, NCQ or SWCQ address this by letting the SATA device
decide in which order the sectors are written and tell the controller
(AHCI, for example) which of those sectors have been written. In other
words, the elevator logic has been moved down to the hard drive.

> So we have to "reorder" (using various algorithms) these requests to
> minimize the seek time by making the head move in an optimized way.
> 
> The various algorithms used for these purposes:
> 
> 1) First-in, first-out (FIFO) : Process requests sequentially, in arrival order.
> 2) Shortest Service Time First : Select the disk I/O request that requires
> the least movement of the disk arm from its current position.
> 3) SCAN : The arm moves in one direction, satisfying all outstanding
> requests until it reaches the last track in that direction, and then the
> direction is reversed.
> 4) C-SCAN : Restricts scanning to one direction only. When the last track has
> been visited in one direction, the arm is returned to the opposite end of
> the disk and the scan begins again.
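
To make the four policies listed above concrete, here is a minimal Python
sketch. It is an illustration only, not code from Linux or Xen: the function
names, the track numbers and the starting head position are made up, and the
SCAN/C-SCAN variants reverse (or wrap) at the last pending request rather
than at the physical last track, which is often called LOOK/C-LOOK.

```python
def fifo(requests, head):
    """Serve requests strictly in arrival order."""
    return list(requests)


def sstf(requests, head):
    """Always serve the pending request closest to the current head position."""
    pending = list(requests)
    order = []
    while pending:
        nearest = min(pending, key=lambda track: abs(track - head))
        pending.remove(nearest)
        order.append(nearest)
        head = nearest
    return order


def scan(requests, head):
    """Sweep upward from the head, then reverse and sweep downward."""
    upward = sorted(t for t in requests if t >= head)
    downward = sorted((t for t in requests if t < head), reverse=True)
    return upward + downward


def c_scan(requests, head):
    """Sweep upward, return to the lowest pending track, sweep upward again."""
    upward = sorted(t for t in requests if t >= head)
    wrapped = sorted(t for t in requests if t < head)
    return upward + wrapped


def head_movement(order, head):
    """Total number of tracks the arm travels when serving `order`."""
    total = 0
    for track in order:
        total += abs(track - head)
        head = track
    return total


if __name__ == "__main__":
    pending = [98, 183, 37, 122, 14, 124, 65, 67]  # hypothetical track numbers
    start = 53                                     # hypothetical head position
    for policy in (fifo, sstf, scan, c_scan):
        order = policy(pending, start)
        print(policy.__name__, order, head_movement(order, start))
```

head_movement() counts the C-SCAN return stroke as ordinary arm travel, which
is a simplification. None of this models rotational delay, request merging,
or queueing inside the drive.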
> 
> 
> Now the problem in virtualized environments is that there are two levels of
> disk I/O scheduling;
> 1) at the guest VM level (domU)
> 2) at the hypervisor level (dom0)
> 
> Because of this, if scheduling is carried out in the hypervisor (dom0), then
> the scheduling carried out at the guest VM level (domU) will be of no use.
> E.g. if the guest uses FIFO and dom0 uses Shortest Service Time First, then
> ultimately the Shortest Service Time First order will be applied and the FIFO
> scheduling will be wasted.
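
The FIFO-then-Shortest-Service-Time-First example above can be sketched in a
few lines of Python. This is a toy model under a strong assumption: the
backend sees the whole batch at once and is free to re-sort it. The function
names and track numbers are hypothetical and do not correspond to any Xen or
Linux interface.

```python
def shortest_service_time_first(requests, head):
    """Re-sort a batch so the nearest pending track is always served next."""
    pending = list(requests)
    order = []
    while pending:
        nearest = min(pending, key=lambda track: abs(track - head))
        pending.remove(nearest)
        order.append(nearest)
        head = nearest
    return order


def dispatch_through_backend(guest_order, head):
    """Model dom0 re-sorting a batch that the guest has already ordered."""
    return shortest_service_time_first(guest_order, head)


if __name__ == "__main__":
    guest_fifo = [98, 183, 37, 122]   # order chosen by the guest (FIFO)
    guest_other = [122, 37, 183, 98]  # same requests, different guest policy
    print(dispatch_through_backend(guest_fifo, head=53))
    print(dispatch_through_backend(guest_other, head=53))
```

Both guest orderings come out of the backend in the same order, which is the
sense in which the guest-side scheduling is "wasted". In practice the backend
only reorders requests that are queued at the same time, so the effect is
weaker than in this sketch.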
> 
> Currently in XEN we have the following pattern:
> 
> 1) *at guest VM: The NOOP scheduler* : It is the most basic scheduler that
> inserts all incoming I/O requests into a simple, unordered FIFO queue and
> implements request merging.
> 
> 2) *at dom0 : uses cfq scheduler*

Take a look at Vivek Goyal's talk at the recent LSF/MM mini-summit:

http://lwn.net/Articles/400589/

(unfortunately it doesn't have the slides, maybe you can email him for
more details).
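
To verify which elevator each side is actually using (NOOP in the guest, CFQ
in dom0, as described above), Linux exposes the choice per block device in
/sys/block/<device>/queue/scheduler, with the active scheduler shown in
square brackets. A small Python sketch for reading it follows; the helper
names are made up, and the device names in the usage comment (xvda, sda) are
only examples that depend on your setup.

```python
from pathlib import Path


def parse_scheduler_line(line):
    """Return (active, available) from a line such as 'noop deadline [cfq]'."""
    active = None
    available = []
    for name in line.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]   # strip the brackets marking the active one
            active = name
        available.append(name)
    return active, available


def read_scheduler(device):
    """Read the scheduler setting for a block device name such as 'sda'."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_scheduler_line(path.read_text())


# Example usage, run as a user who can read sysfs:
#   read_scheduler("xvda")   # inside a PV guest
#   read_scheduler("sda")    # in dom0
```

Writing a scheduler name into the same file (as root) switches the elevator
for that device at runtime, which makes it easy to experiment with different
guest/dom0 combinations before changing any code.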
> 
> 
> *The Idea:*
> 
> So effectively no scheduling is carried out at the guest level in XEN.
> However, the paper suggests that it is better if the guest VM schedules the
> tasks according to its needs (some VMs may run applications requiring
> sequential disk I/O, thus needing shortest seek time, while others may
> be running applications that access data randomly and may need SCAN).

OK, but that won't be a problem with SSDs where both values are about
the same (ok, sequential disk I/O will be higher, but not that much).

>  Thus it is more useful if the guest VMs carry out the scheduling at their
> level. In such a case it becomes necessary to make the hypervisor aware that
> scheduling has already been carried out in the guests and not to carry out any
> scheduling at the dom0 level.

So re-prioritize the I/O in the guest. I was under the impression that
'ionice' would be doing exactly that - prioritizing I/Os from specific
applications? Which means re-prioritizing the I/O queue with more
important I/Os, which are then fed into the NOOP I/O scheduler?

>  Thus we want to make modifications to make XEN aware of whether scheduling has
> been carried out at the guest VM level, and then apply its disk I/O
> scheduling policy accordingly, thereby not disturbing the scheduling carried
> out at the guest VM level.
> 
> /////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
> 
> 2) about netchannel2 (in context of the second idea : Network Interface
> Virtualization)
> 
> We went through netchannel2 and found that what you said was right. A
> considerable part of the idea was already implemented in netchannel2.
> We would go through netchannel2 and find if anything more could be done in
> it.
> Also any suggestions regarding what more could be done would be of great
> help.

Well, NetChannel2 is dead. The code hasn't been forward-ported to PV-OPS, so
unless somebody looks at it, it won't be in PV-OPS kernels. But that is a
separate discussion.

> 
> /////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
> 
> 3) about NUMA-aware hypervisors. (in context of 3rd idea : Asymmetry aware
> hypervisor)
> 
> We went through NUMA-aware hypervisors and found that our suggested idea
> does not address NUMA aware hypervisors. However, we might consider that
> idea if it is not yet done.
> 
> *Further elaboration of the idea* (Asymmetry aware hypervisor):
> To ensure that asymmetric hardware is well utilized, the system must match
> each  thread  with the right type of core:
> e.g., memory-intensive threads with slow cores and compute-intensive
> threads with fast cores.
> 
> This can be accomplished by an asymmetry-aware thread scheduler in the guest
> operating system, where properties of individual threads can be monitored
> more easily than at the hypervisor level.  However, if the hypervisor is not
> asymmetry-aware it can thwart the efforts of the asymmetry-aware
> guest OS scheduler, for instance if it consistently maps the virtual CPU
> (vCPU)  that the guest believes to be fast to a physical core that is
> actually slow.
> 
> This paper focuses on enabling asymmetric core support in hypervisors.
> Asymmetric (dissimilar) cores are seen to give better efficiency than
> multi-core processors having similar cores. In such a case one kind of
> process can be handled by one group of cores while another kind of
> process can be handled by another group of cores.
> 
> Operating systems have schedulers which can multiplex the threads to the
> different cores. However currently hypervisors do not have scheduling based
> on the asymmetric nature of the cores.
> 
> *THE IDEA:*
> 
> The paper proposes to make the hypervisor aware of this asymmetric nature.
> It proposes to map the vCPUs (present in the VMs) to physical cores
> of the same kind, i.e. fast vCPUs will be mapped to fast physical cores and
> slow vCPUs will be mapped to slow physical cores.
> Thus, due to this mapping, the threads in the VMs would be
> serviced by the appropriate set of cores.
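
The mapping step described above (fast vCPUs to fast cores, slow vCPUs to
slow cores) can be sketched as an affinity table. This is not the paper's
scheduler and not Xen code: the function name, the vCPU labels and the core
numbering are all hypothetical, and the sketch ignores the fairness and
priority goals listed earlier in the thread.

```python
def affinity_by_kind(vcpu_kinds, core_kinds):
    """Give every vCPU the set of physical cores of the same kind.

    vcpu_kinds: dict mapping a vCPU label to 'fast' or 'slow'
    core_kinds: dict mapping a physical core number to 'fast' or 'slow'
    """
    cores_of = {}
    for core, kind in sorted(core_kinds.items()):
        cores_of.setdefault(kind, []).append(core)
    # Copy the list so each vCPU owns its own affinity list.
    return {vcpu: list(cores_of.get(kind, []))
            for vcpu, kind in vcpu_kinds.items()}


if __name__ == "__main__":
    cores = {0: "fast", 1: "fast", 2: "slow", 3: "slow"}  # made-up topology
    vcpus = {"vm1.v0": "fast", "vm1.v1": "slow", "vm2.v0": "fast"}
    for vcpu, allowed in affinity_by_kind(vcpus, cores).items():
        print(vcpu, "may run on cores", allowed)
```

A table like this only says where a vCPU is allowed to run. Sharing the fast
cores fairly among competing VMs, and keeping them busy before the slow ones,
would still have to be handled by the hypervisor scheduler itself.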
> 
> ///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel

