Re: [Xen-devel] [PATCH] Paravirt framebuffer series [0/6]

"Christian Limpach" <christian.limpach@xxxxxxxxx> writes:

> On 8/18/06, Jeremy Katz <katzj@xxxxxxxxxx> wrote:
>> The following series of patches is a roll-up of everything needed for
>> the paravirt framebuffer to work for me.
>> http://lists.xensource.com/archives/html/xen-devel/2006-07/msg00156.html
>> is the last mail about the basic idea and there's also
>> http://wiki.xensource.com/xenwiki/VirtualFramebuffer
>>
>> This set of patches includes all of the basic infrastructure from the
>> previous patchset.  Also, for infrastructure, it includes Amos
>> Waterland's work to get xenconsole using /dev/xvc0 instead of taking
>> over the primary tty devices.
>>
>> The new patches add support for launching sdlfb or vncfb from xend if
>> you've configured your guest to do so.  We also
>> set /<dompath>/console/use_graphics to let the guest know that it should
>> default to the graphical console.  Otherwise, it falls back to using the
>> xenconsole device.
>
> We've now had a chance to look through these patches.  Steven who did
> the review is unavailable for a couple of weeks, so I'm forwarding his
> comments.  I think it's absolutely necessary to address all issues
> with the protocol used between the frontend and the backend - the
> layout of any structures used on the communication rings and all
> information which gets written to xenstore.
>
>    christian

Thanks a lot for the detailed review!

As I inherited much of the code, I can't really answer some of your
questions about why certain things are done the way they're done.
I'll do what I can...

It would be useful to know which issues must be fixed before you can
merge the frame buffer, and which could be done afterwards.

> Here are the comments:
> I don't really understand why you need to use vmalloc in so many
> places, but that's could just be because I'm missing something.

Huh?  I can see just two places, both in xenfb_probe():

* the frame buffer

* the array of mfns for the frame buffer

Anthony, care to explain why?

>> +config XEN_KEYBOARD
>> +    tristate "Keyboard-device frontend driver"
>> +    depends on XEN
>> +    default y
>> +    help
>> +      The keyboard-device frontend driver allows the kernel to create a
>> +      virtual keyboard.  This keyboard can then be driven by another
>> +      domain.  If you've said Y to CONFIG_XEN_FRAMEBUFFER, you probably
>> +      want to say Y here.
>> +
> What happens if you configure XEN_KEYBOARD but not XEN_FRAMEBUFFER?

The backend refuses to connect, i.e. you can't use it.  Anthony, can
you shed some light on why you created two separate configs?

> Also, this appears to provide mouse support, and so is misnamed.

It's a device producing input events, both keyboard and mouse.  Care
to suggest a better name?

>>  config XEN_SCRUB_PAGES
>>      bool "Scrub memory before freeing it to Xen"
>>      default y
>
>> +static int xenfb_thread(void *data)
>> +{
>> +    struct xenfb_info *info = data;
>> +    DECLARE_WAITQUEUE(wait, current);
>> +
>> +    add_wait_queue(&info->wq, &wait);
>> +    for (;;) {
>> +            if (kthread_should_stop())
>> +                    break;
>> +            if (info->dirty)
>> +                    xenfb_update_screen(info);
>> +            set_current_state(TASK_INTERRUPTIBLE);
>> +            schedule();
>> +    }
>> +    remove_wait_queue(&info->wq, &wait);
>> +    return 0;
> Is there any reason this couldn't use wait_event_interruptible?

Can't see why not, but I'm looking at this code for the first time,
too.  Want me to try?

>> +}
>
>> +static void xenfb_timer(unsigned long data)
>> +{
>> +    struct xenfb_info *info = (struct xenfb_info *)data;
>> +    info->dirty++;
> Could this end up running at the same time as update_screen and race
> to update dirty?  I suppose it doesn't actually matter too much if it
> does.

xenfb_timer() runs from the timer interrupt, sets dirty, then kicks
the wait queue.

A separate kernel thread running xenfb_thread() sleeps on the wait
queue.  When it wakes up, and dirty is set, it clears dirty and
updates the screen.  Then goes back to sleep.

Obviously, dirty is always set when the thread wakes up.  If
xenfb_timer() runs before the thread goes to sleep again, the wakeup
is lost, and the value of dirty doesn't matter.  I believe that's
okay.

I clarified info->dirty++ to info->dirty = 1.

>> +    wake_up(&info->wq);
>> +}
>> +
>> +static void xenfb_refresh(struct xenfb_info *info,
>> +                      int x1, int y1, int w, int h)
>> +{
>> +    int y2, x2;
>> +
>> +    y2 = y1 + h;
>> +    x2 = x1 + w;
>> +    if (info->y2 == 0) {
>> +            info->y1 = y1;
>> +            info->y2 = y2;
>> +    }
>> +    if (info->x2 == 0) {
>> +            info->x1 = x1;
>> +            info->x2 = x2;
>> +    }
>> +
>> +    if (info->y1 > y1)
>> +            info->y1 = y1;
>> +    if (info->y2 < y2)
>> +            info->y2 = y2;
>> +    if (info->x1 > x1)
>> +            info->x1 = x1;
>> +    if (info->x2 < x2)
>> +            info->x2 = x2;
>> +
>> +    if (timer_pending(&info->refresh))
>> +            return;
>> +
>> +    mod_timer(&info->refresh, jiffies + HZ/xenfb_fps);
> So the actual refresh is sent iff the screen goes idle for a certain
> amount of time?  I guess that makes sense.

That's how I read this code.

>> +}
>> +
>
>> +static irqreturn_t xenfb_event_handler(int rq, void *dev_id,
>> +                                   struct pt_regs *regs)
>> +{
>> +    struct xenfb_info *info = dev_id;
>> +    __u32 cons, prod;
>> +
>> +    if (!info->page || !info->page->initialized)
>> +            return IRQ_NONE;
>> +
>> +    prod = info->page->in_prod;
>> +    rmb();                  /* ensure we see ring contents up to prod */
>> +    for (cons = info->page->in_cons; cons != prod; cons++) {
>> +            union xenfb_in_event *event;
>> +            event = &XENFB_RING_REF(info->page->in, cons);
>> +            notify_remote_via_evtchn(info->evtchn);
>> +    }
> I don't think you need to do a notify on every pass of the loop.  In
> fact, I don't think you need any notify_remote, or in fact any loop at
> all.  What is this trying to achieve?

Let me digress a bit.

The code implements the (informal) specification at
http://wiki.xensource.com/xenwiki/VirtualFramebuffer

That specification is designed for possible growth: the protocol
between frontend and backend can be extended by new events without
breaking backward compatibility.

Some parts of the protocol are just room for growth, e.g. when there
are no events of certain kinds defined (yet).  The code implementing
those parts is partly just to deal gracefully with future extensions,
partly pure skeleton.  If you look only at the code, this may seem
pointless, especially the skeletons.

Back to the code at hand.

I figure we need the loop because event channels don't count events.

The notification is specified by the protocol.  Why?  I don't know.  I
didn't design it I just faithfully ported the code to current Xen.

Of course, the protocol *should* be discussed now.  Anthony, I think
we could use your help there.

>> +    /* FIXME do I need a wmb() here? */
> Only if you care about not losing the non-existent XENFB incoming
> events.

There is an in event: XENFB_TYPE_SET_EVENTS.  It's ignored, though.
If I read the spec correctly, no output events should be generated
unless requested, and this is the means to request them.  Currently,
the backend always requests update events, the front end ignores such
requests, and always generates update events.  Do you want this fixed
for the initial merge?

I put the wmb() in.

>> +    info->page->in_cons = cons;
>> +
>> +    return IRQ_HANDLED;
>> +}
>> +
>> +static unsigned long vmalloc_to_mfn(void *address)
>> +{
>> +    return pfn_to_mfn(vmalloc_to_pfn(address));
> Maybe use arbitrary_virt_to_machine?

Beats me :)

ADDRESS points into memory obtained from vmalloc().  Tell me how to
translate that to mfn, and I'll fix the code.

>> +}
>> +
>> +static struct xenfb_info *xenfb_info;
>> +static int xenfb_irq;
> Is there any good reason why these need to be here?  xenfb_irq could
> be put into struct xenfb_info.

Agreed on xenfb_irq.

As far as I can see, xenfb_info is needed only for disabled code in
xenfb_resume().  If it bothers you, I'll take it out.

>> +static int __init xenfb_probe(void)
> It might be nice to break this function up a little.

It's a bit long, but the flow's straightforward.  If it bothers you, I
can try to factor some parts into functions.

>> +{
> ...
>> +    info->page->width = 800;
>> +    info->page->height = 600;
>> +    info->page->depth = 32;
> It might be nice to make these #defines somewhere.

Matter of taste.  Done.

Hmm, xenfb_mem_len should be something like width * height * depth /
CHAR_BIT, suitably rounded up.

>> +    info->page->line_length = (info->page->depth / 8) * info->page->width;
>> +    info->page->mem_length = xenfb_mem_len;
>> +    info->page->in_cons = info->page->in_prod = 0;
>> +    info->page->out_cons = info->page->out_prod = 0;
>> +
>> +    ret = bind_evtchn_to_irqhandler(info->evtchn, xenfb_event_handler,
>> +                                    0, "xenfb", info);
>> +    if (ret < 0)
>> +            // FIXME need to close evtchn?
> As far as I can see, yes.

Done.

>> +            goto error_kfree;
>> +
> ...
>> +
>> +    info->kthread = kthread_run(xenfb_thread, info, "xenfb thread");
> kthread_run can fail.

Error checking supplied.

Hmm, do we have to stop the kthread?

>> +
>> +    ret = register_framebuffer(fb_info);
>> +    if (ret)
>> +            goto error_unbind;
>> +
>> + again:
>> +    ret = xenbus_transaction_start(&xbt);
>> +    if (ret)
>> +            goto error_unreg;
>> +    ret = xenbus_printf(xbt, "vfb", "page-ref", "%lu",
>> +                        virt_to_mfn(info->page));
> Grant tables?

Elaborate, please.

>> +}
>
>> +void xenfb_resume(void)
> This (a) doesn't work (obviously) and (b) isn't called from anywhere.

Yes.  I suspect it's work in progress on suspend/resume.  Shall I take
it out for now?

>> +{
>> +#if 0 /* FIXME */
>> +    int i, ret;
>> +
>> +    xenfb_info->page = mfn_to_virt(xen_start_info->fbdev_mfn);
>> +    for (i = 0; i < xenfb_info->nr_pages; i++)
>> +            xenfb_info->mfns[i] = vmalloc_to_mfn(xenfb_info->fb + i * 
>> PAGE_SIZE);
>> +    xenfb_info->page->pd[0] = vmalloc_to_mfn(xenfb_info->mfns);
>> +
>> +    if (xenfb_irq)
>> +            unbind_from_irqhandler(xenfb_irq, NULL);
>> +
>> +    printk("xenfb: resume(%d)\n", xen_start_info->fbdev_evtchn);
>> +    ret = bind_evtchn_to_irqhandler(xen_start_info->fbdev_evtchn,
>> +                                    xenfb_event_handler, 0, "xenfb", 
>> xenfb_info);
>> +    if (ret <= 0)
>> +            return;
>> +    xenfb_irq = ret;
>> +#else
>> +    printk(KERN_DEBUG "xenfb_resume not implemented\n");
>> +#endif
>> +}
>> +
>> +static int __init xenfb_init(void)
>> +{
>> +    return xenfb_probe();
> You might want to consider using xenbus_device and friends here.

I considered that, but it wasn't obvious to me how to make it fit
cleanly to a user-space backend.  Care to advise?

>> +}
>> +
>> +module_exit(xenfb_cleanup);
>> +
>> +MODULE_LICENSE("GPL");
>
>> +static irqreturn_t input_handler(int rq, void *dev_id, struct pt_regs *regs)
>> +{
> ...
>> +                                             event->button.pressed);
>> +                    break;
>> +            case XENKBD_TYPE_KEY:
>> +                    input_report_key(dev->dev, event->key.keycode, 
>> event->key.pressed);
>> +                    break;
>> +            }
>> +
>> +            notify_remote_via_evtchn(dev->evtchn);
> Do you need to do a notify here?

Specified by the protocol.

>> +    }
>> +    input_sync(dev->dev);
>> +    /* FIXME do I need a wmb() here? */
> Depends how much you care about overflowing the ring.

I put the wmb() in.

>> +    info->in_cons = cons;
>> +
>> +    return IRQ_HANDLED;
>> +}
>> +
>> +static struct xenkbd_device *xenkbd_dev;
>> +static int xenkbd_irq;
> Why are these needed?

Same deal as for xenfb_info and xenfb_irq above, I guess.

>> +
>> +int __init xenkbd_init(void)
>> +{
> ...
>> +    ret = xenbus_transaction_end(xbt, 0);
>> +    if (ret) {
>> +            if (ret == -EAGAIN)
>> +                    goto again;
>> +            /* FIXME really retry forever? */
> It's what we do everywhere else.

Works for me then.

>> +            goto error_unreg;
>> +    }
>
>> diff -r ec03b24a2d83 -r 6ca424e1867e 
>> linux-2.6-xen-sparse/include/linux/xenfb.h
>> --- /dev/null        Thu Jan 01 00:00:00 1970 +0000
>> +++ b/linux-2.6-xen-sparse/include/linux/xenfb.h     Fri Aug 18 16:17:58 
>> 2006 -0400
> This seems to define the actual frontend<->backend protocol, so
> probably belongs in xen/include/public/io.
>
>> +
>> +/* out events */
>> +
>> +#define XENFB_TYPE_MOTION 1
> It's not obvious to me what XENFB_TYPE_MOTION would do, and it doesn't
> appear to ever get generated.

Appears to be a sketch of future work.  I don't understand it either.
Anthony?

>> +#define XENFB_TYPE_UPDATE 2
>> +
>> +struct xenfb_motion
>> +{
>> +    __u8 type;          /* XENFB_TYPE_MOTION */
>> +    __u16 x;            /* The new x coordinate */
>> +    __u16 y;            /* The new y coordinate */
>> +};
>> +
>> +struct xenfb_update
>> +{
>> +    __u8 type;          /* XENFB_TYPE_UPDATE */
>> +    __u16 x;            /* source x */
>> +    __u16 y;            /* source y */
>> +    __u16 width;        /* rect width */
>> +    __u16 height;       /* rect height */
>> +};
>> +
>> +union xenfb_out_event
>> +{
>> +    __u8 type;
> Does this really want to be in the union with the actual out_events?

Convenient way to access the type.  Sure, you could just access
update.type, but that suggests we already know it's update, which is
somewhat misleading when we don't.

>> +    struct xenfb_motion motion;
>> +    struct xenfb_update update;
>> +    char _[40];
>> +};
>> +
>> +/* in events */
>> +
>> +#define XENFB_TYPE_SET_EVENTS 1
> This is generated but there's no code anywhere to actually process it.

Yes.  See discussion of XENFB_TYPE_SET_EVENTS above.

>> +struct xenfb_page
>> +{
>> +    __u8 initialized;
>> +    __u16 width;         /* the width of the framebuffer (in pixels) */
>> +    __u16 height;        /* the height of the framebuffer (in pixels) */
>> +    __u32 line_length;   /* the length of a row of pixels (in bytes) */
>> +    __u32 mem_length;    /* the length of the framebuffer (in bytes) */
>> +    __u8 depth;          /* the depth of a pixel (in bits) */
> Is it really a good idea for these to be on the shared page?  What
> happens if the {front,back}end is malicious and modifies them while
> the framebuffer is operating normally?

Obviously, both frontend and backend should be robust against bad
shared page contents.  The shared page is writable by both, because
both need to step the rings.

The members above are for information flowing from frontend to
backend.  For safety, they should be write-only in the frontend, and
the backend should copy the info to a safer place, checking its
consistency.

I can improve safety as sketched.  Feel free to suggest a better way
to transmit the same information.

>> +
>> +    unsigned long pd[2];
> This could do with a better name.  Also, pd[1] appears to always be 0.

I suspect it's short for `page directory'.  Shall we call it pgdir?

Yes, pd[1] is always 0 and appears not to be used.  Anthony, why is pd
an array?

>> +
>> +    __u32 in_cons, in_prod;
>> +    __u32 out_cons, out_prod;
>> +
>> +    union xenfb_in_event in[XENFB_IN_RING_SIZE];
> As far as I can see, there's only ever one xenfb_in_event generated
> throughout the life of the buffer, and it's ignored.  Is this queue
> necessary?

See discussion of XENFB_TYPE_SET_EVENTS above.

>> +    union xenfb_out_event out[XENFB_OUT_RING_SIZE];
>> +};
>> +
>> +void xenfb_resume(void);
>> +
>> +#endif
>> diff -r ec03b24a2d83 -r 6ca424e1867e 
>> linux-2.6-xen-sparse/include/linux/xenkbd.h
>> --- /dev/null        Thu Jan 01 00:00:00 1970 +0000
>> +++ b/linux-2.6-xen-sparse/include/linux/xenkbd.h    Fri Aug 18 16:17:58 
>> 2006 -0400
> Again, belongs in public/io.
>
>> @@ -0,0 +1,82 @@
>
>> +/* out events */
>> +
>> +union xenkbd_out_event
>> +{
>> +    __u8 type;
>> +    char _[40];
>> +};
> Eh?

Again, the protocol is designed for extendability.  This is a
placeholder for future xenkbd output events.

>> +
>> +/* shared page */
>> +
>> +#define XENKBD_IN_RING_SIZE (2048 / 40)
>> +#define XENKBD_OUT_RING_SIZE (1024 / 40)
> Why 2048 and 1024?

The shared page is carved up into three parts: info at offset 0, size
up to 1024, input ring at offset 1024, size 2*1024, output ring at
offset 3*1024, size 1024.

I fear the code doesn't implement the spec... rings follow info
immediately, so they'll wander when info is extended.  Want me to fix
this?

Same issue with xenfb.h, with ring sizes swapped.

>> +
>> +#define XENKBD_RING_SIZE(ring) (sizeof((ring)) / sizeof(*(ring)))
>> +#define XENKBD_RING_REF(ring, idx) (ring)[(idx) % XENKBD_RING_SIZE((ring))]
>> +
>> +struct xenkbd_info
>> +{
>> +    __u8 initialized;
>> +    __u32 in_cons, in_prod;
>> +    __u32 out_cons, out_prod;
>> +
>> +    union xenkbd_in_event in[XENKBD_IN_RING_SIZE];
>> +    union xenkbd_out_event out[XENKBD_OUT_RING_SIZE];
> What's the out ring for?

Placeholder for xenkbd output events.

>> +};
>> +
>> +void xenkbd_resume(void);
>> +
>> +#endif

[Review of backend...]

I'll reply to this separately.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
WARNING - OLD ARCHIVES

xen-devel

Re: [Xen-devel] [PATCH] Paravirt framebuffer series [0/6]