This patch set provides a new function for disk I/O management.
This function controls the dispatching of I/O requests for virtual block devices
(blkback/blktap) in order to protect or differentiate I/O performance among
domains.
For now, protecting and differentiating I/O performance is based on the
number of dispatched requests.
This patch set includes the following two parts.
- A control framework for virtual I/O requests.
This framework provides an interface to control virtual I/O requests in
the blkback and blktap mechanisms.
A virtual I/O request controller can be installed or removed as a module.
- A simple virtual I/O request scheduler.
This scheduler controls virtual I/O requests based on the number of
requests per turn.
It makes blkback/blktap threads wait when their share per turn is used up.
A share is the number of requests per turn allocated to each
virtual block device.
Therefore, virtual I/O requests can be controlled in proportion to
the shares of the blkback/blktap threads.
We made this function because Xen needs the ability to manage I/O resources.
Xen has differentiated services for CPU and memory (and maybe network), but not
for I/O.
(There may already be a control function based on CFQ in XenEnterprise.)
So we developed an I/O control function for blkback/blktap.
The I/O performance of each blkback/blktap thread is protected in proportion to
its share among all threads and is shielded from the influence of other
domains' I/O.
We think there are two places where I/O can be controlled in Xen: the dispatch
point in blkback/blktap and the I/O scheduler within Linux.
The former has the advantage of fitting the Xen architecture (the
backend/frontend split), while the latter has the advantage of using the
unmodified Linux I/O schedulers.
Virtualization should be OS-agnostic, so this patch set implements I/O control
at the former place.
The first part is the control framework.
The aim of the control framework is to control blkback, blktap, or both
with a single control module.
This makes it easy to develop and test management functions.
The control framework is constructed as follows.
iomgr === control module
"iomgr" is the core of the control framework, and it connects a control
module and blkback/blktap.
In addition, it counts the total number of pending requests.
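As a rough illustration of this structure, the following user-space Python model shows a core object that sits between the backends and a pluggable control module and keeps a total pending-request count. It is only a sketch under stated assumptions: the class and method names (IoMgr, register, may_dispatch and so on) are hypothetical and are not the interfaces in the patch, the device names are examples, and "pending" is taken here to mean requests that were dispatched and have not completed yet.

```python
class AllowAll:
    """Fallback policy used while no control module is loaded (hypothetical)."""

    def may_dispatch(self, dev):
        return True


class DenyDevices:
    """Toy control module that holds back requests from selected devices."""

    def __init__(self, blocked):
        self.blocked = set(blocked)

    def may_dispatch(self, dev):
        return dev not in self.blocked


class IoMgr:
    """Connects the backends (blkback/blktap) to one control module."""

    def __init__(self):
        self._default = AllowAll()
        self.controller = self._default
        self.pending = 0  # total requests dispatched and not yet completed

    def register(self, controller):
        # Install a control module, as loading the module would.
        self.controller = controller

    def unregister(self):
        # Remove the control module and fall back to "allow everything".
        self.controller = self._default

    def submit(self, dev):
        # Called by a backend thread before it dispatches one request.
        if not self.controller.may_dispatch(dev):
            return False  # the backend thread has to wait
        self.pending += 1
        return True

    def complete(self, dev):
        # Called by a backend when a dispatched request finishes.
        self.pending -= 1
```

Because both backends go through the same submit/complete calls in this model, a single control module can cover blkback, blktap, or both.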
The second part is a simple scheduler.
It is one implementation of a control module.
Our scheduler controls the number of virtual I/O requests per turn.
Currently, the backend driver runs in Domain 0, which is Linux, so I/O
performance is affected by the Linux I/O schedulers.
To differentiate I/O performance, the dispatching of I/O requests has to be
held back.
Our scheduler controls I/O requests based on the number of dispatched I/O
requests.
When a blkback/blktap thread has used up its share, it waits until
all threads have used up their own shares.
As an exception, if some threads still have share remaining but no
thread has any requests, the turn moves on to a new round and all threads
get their maximum shares back.
Our scheduler checks whether there are any pending requests.
We call this scheduler the "turn-based scheduler".
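The turn logic can be sketched as a small user-space simulation in Python. This is a simplified model and not the code in the patch: the function name run_turns and the thread names are hypothetical, the request queues are fixed up front, and the model assumes that a turn also ends when the only threads with requests left are those that have already used up their shares (the description above does not spell out that case).

```python
def run_turns(max_shares, queued):
    """Simulate turn-based dispatching.

    max_shares: dict mapping thread name -> maximum share per turn
    queued:     dict mapping thread name -> number of requests waiting
    Returns one dict per turn: thread name -> requests dispatched in that turn.
    """
    queued = dict(queued)  # do not modify the caller's dict
    turns = []
    while any(n > 0 for n in queued.values()):
        # New turn: every thread gets its maximum share back.
        remaining = dict(max_shares)
        dispatched = {name: 0 for name in max_shares}
        while True:
            # Threads that still have share left and a request waiting.
            runnable = [t for t in max_shares
                        if remaining[t] > 0 and queued[t] > 0]
            if not runnable:
                break  # shares used up, or nothing left to dispatch
            for t in runnable:
                remaining[t] -= 1
                queued[t] -= 1
                dispatched[t] += 1
        if not any(dispatched.values()):
            break  # no thread can make progress (e.g. all shares are 0)
        turns.append(dispatched)
    return turns
```

With shares of 2 and 1 and four queued requests each, the first two turns dispatch requests in a 2:1 ratio; once the first thread has no requests left, the turns close early and the second thread keeps dispatching one request per turn.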
We think that the control module can be extended with other I/O control
functions: for example, absolute control, throughput control, access control
management and so on.
The procedure to enable our I/O management is as follows.
1. Enable the relevant config option (CONFIG_XEN_IOMGR)
and the kernel module for runtime configuration (CONFIG_XEN_IOSCHED_TURN).
2. Build and boot the Xen framework and this kernel.
3. Insert the control module into domain 0 (run "modprobe turn_iosched").
4. Configure the shares of the virtual block devices.
The interface to configure the I/O request share of each virtual block
device is exposed through sysfs.
To check a share, read the entry with a command such as cat or less.
To set a share, write to the entry, for example with echo and
redirection.
For example, for domain 1 with blktap device xvda, the entry
/sys/devices/xen-backend/tap-1-51712/iomgr/max_cap
shows the maximum share per turn.
Default is 64. (This value can be changed by rewriting
To configure the share of a virtual block device, write a value
into this entry:
echo 128 > /sys/devices/xen-backend/tap-1-51712/iomgr/max_cap
shows the remaining share in the current turn. This entry is read-only.
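The same read and write operations can be scripted. The helper below is a minimal Python sketch: the function names are hypothetical, the directory pattern (backend type, domain ID, device number) is an assumption generalized from the single tap-1-51712 example, and the entry is assumed to hold a plain decimal integer. The root directory is a parameter so the helper can be tried against a scratch directory before touching sysfs.

```python
import os


def share_entry(domid, devid, backend="tap", root="/sys/devices/xen-backend"):
    """Build the path of the max_cap entry for one virtual block device.

    Assumption: directories are named <backend>-<domid>-<devid>, as in the
    tap-1-51712 example.
    """
    return os.path.join(root, f"{backend}-{domid}-{devid}", "iomgr", "max_cap")


def read_share(path):
    """Read the maximum share per turn (the scripted form of 'cat')."""
    with open(path) as f:
        return int(f.read().strip())


def write_share(path, value):
    """Write a new maximum share (the scripted form of 'echo N > entry')."""
    with open(path, "w") as f:
        f.write(f"{int(value)}\n")
```

For the example device, write_share(share_entry(1, 51712), 128) corresponds to the echo command shown earlier and needs the same privileges in domain 0.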
Xen-devel mailing list