Ian Jackson wrote:
 
Anthony Liguori writes ("[Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
  
The race I know of is that an aio completion signal may arrive after 
you've already called qemu_aio_poll() but before you enter select().  In 
practice, we only sleep for 10ms at a time in select(), so the race is 
bounded by that.  If we wanted to sleep for longer, we would have to 
handle this race properly.
    
 
Yes.  And, 10ms is too long anyway for reasonable performance.  During
my merge with upstream I found that the qemu aio functionality (which
was done quite differently to the old xen ioemu) caused a severe
performance regression under some conditions because of this race.
   
 
Yeah, noticed that too, especially with qcow2.

In KVM, we sleep for 1s in select() and use signalfd() to receive the 
aio notifications.  For older hosts, we emulate signalfd using a thread 
and the pipe-to-self trick.
    
 
Why does it need a thread?  You can just write to the pipe in the
signal handler.  I'll post my code.
   
 
It's a little less perfect.  Your signal handler can write() to the pipe 
but what happens if you get EAGAIN?

So what we do in KVM is use sigwait() within a separate thread.  We 
don't set O_NONBLOCK so the thread blocks if the pipe fills up which is 
exactly the semantics you would want.

Below is our implementation.  I'll queue up to push this change into 
QEMU after I finish with the migration patches.

Regards,

Anthony Liguori
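
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>

/* State handed to the helper thread: the signals to wait for and the
 * write end of the pipe that stands in for a signalfd. */
struct sigfd_compat_info
{
    sigset_t mask;
    int fd;
};

/* Helper thread: wait for a signal in info->mask and forward it down
 * the pipe. */
static void *sigwait_compat(void *opaque)
{
    struct sigfd_compat_info *info = opaque;
    int err;
    sigset_t all;

    /* Block everything in this thread so the signals are only ever
     * consumed through sigwaitinfo(). */
    sigfillset(&all);
    pthread_sigmask(SIG_BLOCK, &all, NULL);

    do {
        siginfo_t siginfo;

        err = sigwaitinfo(&info->mask, &siginfo);
        if (err == -1 && errno == EINTR) {
            err = 0;
            continue;
        }

        if (err > 0) {
            /* One fixed-size 128-byte record per signal, with the
             * signal number first and the rest zeroed. */
            char buffer[128];
            size_t offset = 0;

            memset(buffer, 0, sizeof(buffer));
            memcpy(buffer, &err, sizeof(err));
            while (offset < sizeof(buffer)) {
                ssize_t len;

                /* The pipe is blocking: if it is full, this thread
                 * waits instead of dropping the notification. */
                len = write(info->fd, buffer + offset,
                            sizeof(buffer) - offset);
                if (len == -1 && errno == EINTR)
                    continue;

                if (len <= 0) {
                    err = -1;
                    break;
                }

                offset += len;
            }
        }
    } while (err >= 0);

    return NULL;
}

/* Emulate signalfd with a pipe and a detached helper thread.  The
 * signals in mask must already be blocked by the caller.  Returns the
 * read end of the pipe, or -1 with errno set. */
static int kvm_signalfd_compat(const sigset_t *mask)
{
    pthread_attr_t attr;
    pthread_t tid;
    struct sigfd_compat_info *info;
    int fds[2];
    int err;

    info = malloc(sizeof(*info));
    if (info == NULL) {
        errno = ENOMEM;
        return -1;
    }

    if (pipe(fds) == -1) {
        free(info);
        return -1;
    }

    memcpy(&info->mask, mask, sizeof(*mask));
    info->fd = fds[1];

    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    err = pthread_create(&tid, &attr, sigwait_compat, info);
    pthread_attr_destroy(&attr);

    /* Without a helper thread the pipe would never become readable. */
    if (err != 0) {
        close(fds[0]);
        close(fds[1]);
        free(info);
        errno = err;
        return -1;
    }

    return fds[0];
}

/* Use the real signalfd syscall when the kernel provides it, otherwise
 * fall back to the thread-based emulation. */
int kvm_signalfd(const sigset_t *mask)
{
#if defined(SYS_signalfd)
    int ret;

    ret = syscall(SYS_signalfd, -1, mask, _NSIG / 8);
    if (!(ret == -1 && errno == ENOSYS))
        return ret;
#endif

    return kvm_signalfd_compat(mask);
}
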
Ian.
   
 
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
 