Big update… Before leaving ccu, Gordon made a huge effort and got the Windows version up and running - I'll get some screenshots up here ASAP.
In the meantime I have been debugging and getting the Linux version ready. There is still a nasty bug (one that keeps moving about, so probably memory corruption), but here are some nice screenshots of MeeGo running. I had to hack the VGA BIOS in order to get the 864×480 screen mode, and it worked!
Qemu running MeeGo and Fennec
Qemu prior to fixing the display resolution
It turns out that glReadPixels() itself was not the problem. The problem is that we are inserting this call into what is effectively the guest process's rendering pipeline, even though we're really running on the host. Since Clutter sets GL_PACK_ROW_LENGTH to a non-zero value (among other things), our glReadPixels() call does the wrong thing: the row stride of the output follows GL_PACK_ROW_LENGTH rather than our width value. A simple solution is to save and restore this value, but it's not the only potential troublemaker. Really we need to save and restore all the pixel-store state that affects glReadPixels(). Tests show that using XGetImage() (on hosts where this trick works) is faster than glReadPixels(), so I'll probably add code to do that via XShm and save us (yet another) copy of the data.
I've also noticed some artifacts under Q3, which I suspect are related to our attempts to suppress double-buffering (it's not required, since we effectively copy each frame once it's rendered anyway). My guess is that we're copying the frame more often than required, while it is still being rendered, which results in our seeing it 'flash'. On the whole this wouldn't be visible in something like Clutter, so we may see some more performance there, too.
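To make the GL_PACK_ROW_LENGTH effect concrete, here is a minimal sketch of how the row stride written by glReadPixels() is derived for GL_UNSIGNED_BYTE data. The helper name pack_row_stride() and its parameters are made up for illustration and are not part of OpenGL or of our patch; only the rule it encodes (a non-zero row length replaces the width, and each row is rounded up to GL_PACK_ALIGNMENT) comes from the pixel-store rules.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenGL call): number of bytes between the
 * start of successive rows written by glReadPixels() when the type is
 * GL_UNSIGNED_BYTE.
 *
 *   width            width argument passed to glReadPixels()
 *   bytes_per_pixel  e.g. 4 for GL_RGBA or GL_BGRA
 *   pack_row_length  current GL_PACK_ROW_LENGTH (0 means "use width")
 *   pack_alignment   current GL_PACK_ALIGNMENT (1, 2, 4 or 8)
 */
static size_t pack_row_stride(size_t width, size_t bytes_per_pixel,
                              size_t pack_row_length, size_t pack_alignment)
{
    assert(pack_alignment == 1 || pack_alignment == 2 ||
           pack_alignment == 4 || pack_alignment == 8);

    /* A non-zero GL_PACK_ROW_LENGTH overrides the width for stride purposes. */
    size_t pixels_per_row = pack_row_length > 0 ? pack_row_length : width;
    size_t row_bytes = pixels_per_row * bytes_per_pixel;

    /* Each row starts on a multiple of GL_PACK_ALIGNMENT bytes. */
    return (row_bytes + pack_alignment - 1) / pack_alignment * pack_alignment;
}
```

With an 864-pixel-wide RGBA read, a row length of 0 gives a stride of 3456 bytes, whereas a leftover row length of, say, 1024 gives 4096 bytes per row, so a buffer sized as width * height * 4 would be overrun. On the GL side the save and restore would be done with glGetIntegerv() and glPixelStorei(); the other pack state worth treating the same way includes GL_PACK_ALIGNMENT, GL_PACK_SKIP_ROWS, GL_PACK_SKIP_PIXELS and GL_PACK_SWAP_BYTES.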
It's 10 past 6 in the evening, and the fragile peace here is suddenly shattered by the cry “It works!”.
Finally free from horrible corruption!
The problem turned out to be the use of glReadPixels() to extract the pixel data from the host. It appears that this upsets Clutter's rendering pipeline somehow. For now, switching to XGetImage() seems to be a usable solution, and using XShm should speed things up still further. Performance using XGetImage() is better than with glReadPixels(), and I'm now seeing ~100FPS on my fairly modest hardware.
Here's a gratuitous screenshot of MeeGo 1.0 *actually* running in usable form on qemu, with KVM, on my machine:
So another productive day of nailing bugs down…
I now have Clutter fixed so that it works properly in both 32 and 24bpp modes, which covers most of qemu's available VGA emulations. Screenshots of MeeGo in 24 and 32bpp respectively (cirrus and std VGA):
My next task is to figure out why clutter is causing this to happen. Seems to be losing track of pixmaps and rendering garbage…
I rock :)
Bug narrowed down to Clutter, nothing to do with the OpenGL patch. Clutter's handling of 24bpp images in fallback (non-TFP) mode is completely bong:
for (ypos=0; ypos<height; ypos++)
  for (xpos=0; xpos<width; xpos++)
    {
      /* BUG: steps 4 bytes per pixel, but 24bpp data has only 3 */
      char *p = first_pixel + image->bytes_per_line*ypos
                + xpos * 4;
      p[3] = 0xFF;
    }
data = first_pixel;
bytes_per_line = image->bytes_per_line;
Changing this to:
data = g_malloc (height * width * 4);
data_allocated = TRUE;
bytes_per_line = width*4;
for (ypos=0; ypos<height; ypos++)
  for (xpos=0; xpos<width; xpos++)
    {
      char *p = first_pixel + image->bytes_per_line*ypos
                + xpos * 3;
      char *dst_p = data + bytes_per_line * ypos + xpos * 4;
      dst_p[0] = p[0]; dst_p[1] = p[1]; dst_p[2] = p[2]; dst_p[3] = 0xff;
    }
results in everything working!
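For anyone wanting to try the same idea outside of Clutter, here is a standalone sketch of the conversion using plain malloc() instead of g_malloc(). The function name expand_24_to_32() and its parameters are made up for this example and are not Clutter API; it assumes tightly packed 3-byte source pixels, with any row padding accounted for by src_stride.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: expand packed 3-byte pixels into 4-byte pixels with
 * a fully opaque fourth byte. src_stride is the number of bytes per source
 * line (it may include padding). Returns a malloc()ed buffer of
 * width * height * 4 bytes that the caller must free(), or NULL if the
 * allocation fails. */
static unsigned char *expand_24_to_32(const unsigned char *src,
                                      size_t src_stride,
                                      size_t width, size_t height)
{
    assert(src != NULL);
    assert(src_stride >= width * 3);

    size_t dst_stride = width * 4;
    unsigned char *dst = malloc(height * dst_stride);
    if (dst == NULL)
        return NULL;

    for (size_t y = 0; y < height; y++) {
        for (size_t x = 0; x < width; x++) {
            const unsigned char *p = src + src_stride * y + x * 3;
            unsigned char *q = dst + dst_stride * y + x * 4;
            q[0] = p[0];   /* copy the three colour bytes unchanged */
            q[1] = p[1];
            q[2] = p[2];
            q[3] = 0xFF;   /* force alpha to opaque */
        }
    }
    return dst;
}
```

Writing into a fresh buffer rather than patching the image in place is the key point: a 24bpp source simply has no fourth byte per pixel to overwrite.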
The MeeGo image is now bootable. It doesn't last long (in fact, the first click nuked it) but it's certainly getting there…
So Gordon and I have been working hard on the gloffscreen rendering concept for qemu, and just this morning we merged our codebases, resulting in something that can run glxgears and Q3 without crashing, but more importantly, mutter works! (albeit with some corruption, possibly due to misconfigured fbconfigs).
So here's some news - a version that actually works. Window resizing isn't supported yet, but it certainly proves what we can do here… The screenshot shows everything working - the Qemu display is scaled and the GL windows have scaled with it. There are two windows overlapping without destroying the window manager's borders, and the mouse cursor is on top of everything, where it should be. (The guest OS's cursor, that is - it's on top of the green cog in the upper window. You can see the host cursor on top of the QEMU titlebar…)
So I've spent half a week trying to figure out why XShmPutImage wasn't working (all its parameters looked good), only to find out that it was caused by the ->data pointer in my XImage being uninitialised. Now that that's fixed, look what we can do! In particular, we have overlapped windows and a mouse cursor that isn't underneath the host machine's GL windows (and is thus visible). (Not quite there yet on the display format, but…)
So, an update… Since the early work in 2009 I've cleaned up the guest libGL and the host-side code, implemented a virtio kernel module for the guest, and eliminated the old int99 hack and other transports. This means we can now have accelerated 3D and KVM, all in the qemu window. Speed is excellent now that I've re-implemented the command-queuing mode, with glxgears running near the host's speed, and Q3 jumping from 6FPS without command queuing to 90+ (identical to the host) with it.
It's not all perfect, though. Because the rendering is done on the host in top-level windows inside the qemu window, games and single-window GL apps work very well - but it all goes a bit wonky when you try to overlap windows.
It would be *possible* to make this work using this approach, but it'd be hairy as hell, so instead I'm going to render into a buffer and copy this to the guest, to be blitted into its window. This solves a lot of problems, and since I'm after rendering correctness, that's a good thing. Specifically, problems like the window overlapping issue just go away because the guest OS's X server will just be blitting pixmaps for us, and halted GL processes on the guest won't leave a GL rectangle on screen whilst the window manager cheerfully moves the window border around…
I've begun work on coding this and the screenshot below proves the concept - the rear window is rendered into a pixmap and the foreground one correctly overlaps it. The tearing is, surprisingly, a side-effect of using X to screenshot the scene.
A feasibility study to see if this approach could work: I took this (2 year old) patch to qemu and began updating it.
Accelerated 3D is a feature that's been missing from virtual machines generally, and will allow easier/sandboxed testing of new 3D platforms and applications.
Although the qemu-native (int 0x99) method of passing the OpenGL data from guest to host didn't survive the quick forward port, the TCP-based rendering forwarding did, and works pretty well (70fps in quake3 on a Lenovo X200 laptop, compared to ~85fps native). If this were to be adopted and modified to use virtio I would expect performance to be nearly the same as native.