Welcome to the first issue of what will hopefully be weekly summaries of traffic on the QEMU mailing list. This week, the big news is that the 0.9.1 release finally came out, one year after 0.9.0.
Two as-yet-unmerged patches improve DVD support (both to read DVD movies and to boot the GNU/Solaris live CD), an RTC dyntick patch reduces the CPU usage of idle QEMU instances, and the first stab at a third type of USB support (EHCI) showed up on the list.
Processors access memory in several different ways. For example, ARM doesn't allow unaligned reads and writes, and an atomic 32-bit read from an I/O register is not equivalent to four 8-bit reads. Newer processors support 64-bit memory access, and some I/O devices (especially on SPARC hardware) require it.
Robert Reif recently ran into such hardware on the SPARC platform, and whipped up a patch to add 64-bit I/O functions to QEMU:
Sparc32 has a 64 bit counter that should be read and written as 64 bits but that isn't supported in QEMU. I did a quick hack to add 64 bit i/o and converted sparc32 to use it and it seems to work.
Robert's "quick hack" is actually a fairly extensive patch, including a general reorganization of the existing memory access functions to more carefully distinguish between the different access sizes, and several related cleanups.
Robert's patch makes memory access functions come in groups of four, with their names ending with "b" for byte (8 bits), "w" for word (16 bits), "l" for long (32 bits), and "q" for quadword (64 bits). The 28 new low level memory access functions are unassigned_mem_{read,write}[bwlq](), notdirty_mem_write[bwlq](), watch_mem_{read,write}[bwlq](), and subpage_{read,write}[bwlq]().
Most users of these low level functions call them through data structures of type "CPU{Write,Read}MemoryFuncs". These used to be arrays of three function pointers (indexed by the size of the access). The patch converts them to structures with four function pointer members named "b", "w", "l", and "q". So code that used to look like "io_mem_write[io_index][0](...)" might now look like "io_mem_write[io_index].b(...)". Various readlen() and writelen() wrapper functions, which took as arguments a CPU*MemoryFuncs pointer and an integer index to indicate the access size, went away in favor of directly dereferencing these structures and calling the appropriately sized member function. Several functions that used to return a uint32_t value now return a uint16_t or uint8_t as appropriate.
The patch changes the top level memory access function, "cpu_physical_memory_rw()", to use the new functions and to understand 64 bit reads and writes. A lot of driver code also uses the low level accessor functions directly; their structures were updated to the new format but usually have NULL for the 64 bit accessor function (at least until the driver is examined to see what behavior a 64-bit memory access should have).
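To make the shape of the change concrete, here is a minimal sketch of a four-member accessor structure and a size-based dispatcher. It is not taken from Robert's patch: the names DemoReadMemoryFuncs, DemoCounter and demo_mem_read(), the simplified callback signatures, and the little-endian byte lanes are all assumptions made for illustration. The error return for a missing handler is likewise just one possible policy.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified accessor table: one member per access size.
 * QEMU's real callback signatures differ. */
typedef struct DemoReadMemoryFuncs {
    uint8_t  (*b)(void *opaque, uint64_t addr);  /* byte,      8 bits */
    uint16_t (*w)(void *opaque, uint64_t addr);  /* word,     16 bits */
    uint32_t (*l)(void *opaque, uint64_t addr);  /* long,     32 bits */
    uint64_t (*q)(void *opaque, uint64_t addr);  /* quadword, 64 bits; may be NULL */
} DemoReadMemoryFuncs;

/* Hypothetical device: a single 64-bit counter register. */
typedef struct DemoCounter {
    uint64_t value;
} DemoCounter;

/* Narrow reads pick a little-endian lane out of the register (an assumption). */
static uint8_t demo_counter_readb(void *opaque, uint64_t addr)
{
    const DemoCounter *c = opaque;
    return (uint8_t)(c->value >> (8 * (addr & 7)));
}

static uint16_t demo_counter_readw(void *opaque, uint64_t addr)
{
    const DemoCounter *c = opaque;
    return (uint16_t)(c->value >> (8 * (addr & 6)));
}

static uint32_t demo_counter_readl(void *opaque, uint64_t addr)
{
    const DemoCounter *c = opaque;
    return (uint32_t)(c->value >> (8 * (addr & 4)));
}

/* The 64-bit read returns the whole register in one access. */
static uint64_t demo_counter_readq(void *opaque, uint64_t addr)
{
    const DemoCounter *c = opaque;
    (void)addr;
    return c->value;
}

const DemoReadMemoryFuncs demo_counter_read = {
    .b = demo_counter_readb,
    .w = demo_counter_readw,
    .l = demo_counter_readl,
    .q = demo_counter_readq,
};

/* Dispatch on access size in bytes. Returns 0 on success, or -1 when the
 * size is invalid or the device supplies no handler for it. */
int demo_mem_read(const DemoReadMemoryFuncs *f, void *opaque,
                  uint64_t addr, unsigned size, uint64_t *out)
{
    switch (size) {
    case 1: if (!f->b) return -1; *out = f->b(opaque, addr); return 0;
    case 2: if (!f->w) return -1; *out = f->w(opaque, addr); return 0;
    case 4: if (!f->l) return -1; *out = f->l(opaque, addr); return 0;
    case 8: if (!f->q) return -1; *out = f->q(opaque, addr); return 0;
    default: return -1;
    }
}
```

A driver that has not yet been converted would simply leave the "q" member NULL, and the dispatcher would report the 64-bit access as unsupported rather than guessing how to split it.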
Robert's patch hasn't been applied yet, but he has this to say about it:
I'm suppling the sparc changes to get comments on if this is worth pursuing. This has only received enough testing to verify that the 64 bit counter was being read and written as 64 bits. I was able to run the linux kernel in sparc-test. QEMU will crash on accessing an unsupported size because the latest io changes are partially disabled until I figure out a clean way to call do_unassigned_access. A lot of the embedded processors use the same function for accessing all sizes so converting them will be painful.
Robert later added:
...other architectures with 64 bit hardware have similar requirements. This is a generic solution that fills a hole in the qemu implementation.
Brian Johnson agreed with the patch:
Some non-PC hardware has 64-bit registers which need to be accessed as 64-bit quantities, in order to read or write all fields at once. Qemu should support 64-bit I/O.
On Sparc64 many registers can only be handled using 64 bit accesses, other methods should raise an exception.
Some related work occurred in another thread.
Dan Kenigsberg resubmitted a patch he'd previously submitted in December. There was no discussion, and it didn't show up in the repository.
i think ive found a little bug on the usb mouse emulation (only with the mouse, not the tablet), the wheel doesnt seem to work (it just doesnt do anything). Ive tracked the problem down to usb-hid.c...
That descriptor doesnt contain a 'wheel' section so when the virtualized OS gives me the TD it only has space for dx and dy and not for the wheel delta (its got a maximum length of 3). To fix the problem i just modified the descriptor...
Now the virtualized OS gives me a TD with a maximum length of 4 and the wheel works just fine.
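The effect of the transfer length on the wheel can be sketched as follows. This is an illustration only, not code from usb-hid.c: the helper names are hypothetical, and the report layout (buttons, dx, dy, wheel, one byte each) is assumed from the description in the post.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Clamp a relative movement to the signed 8-bit range used in the report. */
static int8_t demo_clamp8(int v)
{
    if (v > 127) {
        return 127;
    }
    if (v < -127) {
        return -127;
    }
    return (int8_t)v;
}

/* Hypothetical helper: write a mouse report into buf, truncated to the
 * maximum length the guest's transfer allows. Assumed layout: buttons, dx,
 * dy, wheel. Returns the number of bytes written. */
size_t demo_pack_mouse_report(uint8_t *buf, size_t max_len,
                              uint8_t buttons, int dx, int dy, int dz)
{
    const uint8_t fields[4] = {
        buttons,
        (uint8_t)demo_clamp8(dx),
        (uint8_t)demo_clamp8(dy),
        (uint8_t)demo_clamp8(dz),   /* wheel delta: only fits if max_len >= 4 */
    };
    size_t n = max_len < sizeof(fields) ? max_len : sizeof(fields);

    memcpy(buf, fields, n);
    return n;
}
```

With a maximum length of 3 the wheel byte is silently dropped, which matches the symptom described; once the descriptor advertises a fourth byte, the guest asks for 4 and the wheel delta gets through.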
Clemens Kolbitsch started a thread about using performance monitoring tools under qemu. Paul Brook noted that QEMU doesn't emulate performance counters, and continued:
Besides which, qemu is not cycle accurate. Any performance measurements your make are pretty much meaningless, and bear absolutely no relationship to real hardware.
In a later message, Paul continued:
With the exception of some very small embedded cores, Modern CPUs have complex out of order execution pipelines and multi-level cache hierarchies. It's common for performance to be dominated by these secondary factors rather than raw instruction throughput.
Exactly what features dominate performance is very application specific. Determining which factor dominates is unlikely to be something qemu can help with.
However if e.g. you know that for your application there's a good correlation was between performance and L2 cache misses you could instrument qemu to and a L1/L2 cache model. The overhead will be fairly severe (easily 10x slower), and completely screw up any realtime measurements. However it would produce some useful cache use statistics that you could use to guesstimate actual performance. This is similar to how cachegrind works. Obviously if your application isn't cache bound then these figures will be meaningless.
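For readers wondering what such a cache model might look like, here is a minimal sketch of a direct-mapped cache that only counts hits and misses. It is not QEMU code and nothing like it exists in the tree as far as this summary knows. The geometry (64-byte lines, 512 lines) and all names are assumptions, and a real model would also need associativity, a second level, and a hook on every guest memory access.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_LINE_BITS 6                      /* 64-byte cache lines (assumed) */
#define DEMO_SET_BITS  9                      /* 512 lines, 32 KiB total (assumed) */
#define DEMO_NUM_LINES (1u << DEMO_SET_BITS)

/* Hypothetical direct-mapped cache model that tracks tags only, no data. */
typedef struct DemoCache {
    uint64_t tags[DEMO_NUM_LINES];
    bool     valid[DEMO_NUM_LINES];
    uint64_t hits;
    uint64_t misses;
} DemoCache;

/* Record one memory access at guest address addr. */
void demo_cache_access(DemoCache *c, uint64_t addr)
{
    uint64_t line = addr >> DEMO_LINE_BITS;
    unsigned idx = (unsigned)(line & (DEMO_NUM_LINES - 1));
    uint64_t tag = line >> DEMO_SET_BITS;

    if (c->valid[idx] && c->tags[idx] == tag) {
        c->hits++;
    } else {
        /* Miss: the new line evicts whatever occupied this slot. */
        c->misses++;
        c->valid[idx] = true;
        c->tags[idx] = tag;
    }
}
```

Calling something like this on every load and store is where the severe slowdown Paul mentions would come from; the resulting counters are statistics about cache behaviour, not timing measurements.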
Rob Landley had some more details.
John W showed interest in adding support for the second ethernet interface on a gumstix board, got some debugging help from Andrzej Zaborowski, and submitted a working patch a few days later.
Carlo Marcelo Arenas Belon reposted a patch to improve DVD-ROM probing in a way that allows the GNU/Solaris live CD to boot under QEMU.
One idiosyncrasy that emerged during the patch review is that QEMU uses a simple heuristic to determine whether a virtual DVD drive has a CD or DVD in it, by checking the size of the supplied image file and identifying anything below a certain size as a CD and anything larger as a DVD.
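A size-based check of this kind might look like the sketch below. The function name is hypothetical, and the threshold is an assumption (the capacity of an 80-minute CD at 75 frames per second and 2048 bytes per frame); the thread summary above does not state the exact cutoff QEMU uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed threshold: 80 minutes * 60 s * 75 frames/s * 2048 bytes/frame. */
#define DEMO_CD_MAX_BYTES (80ULL * 60 * 75 * 2048)

/* Hypothetical heuristic: images larger than a full CD are treated as DVDs.
 * An image of exactly the threshold size is still reported as a CD. */
bool demo_image_is_dvd(uint64_t image_bytes)
{
    return image_bytes > DEMO_CD_MAX_BYTES;
}
```

The obvious weakness is that a small DVD image is indistinguishable from a CD under this rule, which is why the behaviour counts as an idiosyncrasy.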
A few days later, Carlo posted
Ryan W. Smith joined the long line of people confused by where the gen_op functions come from, and Blue Swirl gave a quick summary of the gen_op build process:
op.c is compiled and the resulting object file op.o is processed by dyngen program, producing gen-op.h, opc.h, and op.h. These define the gen_op* versions of the functions, originally op_something in op.c.
Daniel P. Berrange rejected a patch:
Providing a password which is clearly visible on the command line to all users of a machine is unacceptable. This capability was *explicitly* left out when I did the original VNC password support in QEMU.
He agreed that providing a password from a file might be useful, but suggested a more generic implementation applying to qcow encrypted disks as well. There was no reply.
Laurent Vivier posted a patch to use SCSI passthrough to read movies from DVD. Fabrice Bellard objected that the patch was implemented at the wrong layer, and Laurent promised to redo it.
Alexey Eremenko started a longish thread by asking for a qemu bugzilla. Several sites capable of providing such a service were mentioned, but nobody actually did anything about it. Due to qemu's avowed "alpha" status and rapid development, the QEMU developers currently seem happiest receiving bug reports on the mailing list.
After the suggestion was made that a bug tracker was more appropriate to a project with a "stable" branch, Lauro Ramos Venancio volunteered to maintain such a stable branch for QEMU.
Fabrice Bellard announced the release of version 0.9.1.
Carlo Marcelo Arenas Belon pointed out that his patch to boot GNU/Solaris had just missed the release, but applied cleanly to it.
Juergen Lock wrote about a number of problems running FreeBSD under QEMU and wondered if anyone else cared. Nobody responded specifically about FreeBSD, but Carlo Marcelo Arenas Belon noted:
I never though slirp will ever work in an amd64 (judging by all the pointer <-> integer size mismatches) or any other LP64 architecture, regardless of the guest OS being used, which is why I never even bother to test.
Laurent Vivier commented:
Hi, just a comment: on linux, I've been able to burn a CD with the SCSI passthrough...
Juergen Lock noticed some speed improvements in 0.9.1:
I just played with -drive if=scsi and -kernel-kqemu in a linux guest and a dd bs=64k from a 5MB file to /dev/null got me more than 25 MB/s! While a similar dd off the emulated ide cdrom drive (I was using a livecd iso, sidux-2007-04.5-200712260120-eros_xmas-kde-lite-i386.iso) only gets me about a tenth of that. Can anyone reproduce this? :) This is the first time I've seen qemu doing more than a few MB/s IO on this box...
Anders Melchiorsen posted:
This patch stops the qemu seconds timer when the RTC is not used. Once the cmos is accessed, the skipped seconds are accounted for, and the timer is enabled again.
In case interrupts are to be delivered, the timer is not disabled.
This change eliminates the last host-induced periodic timer in my kvm setup.
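The idea of stopping the timer and catching up on the next access can be sketched roughly as below. This is an illustration under assumptions, not Anders' patch: the DemoRtc structure, the function names, and the use of whole host seconds as the time base are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical RTC state; host time is passed in as whole seconds. */
typedef struct DemoRtc {
    uint64_t seconds;        /* guest-visible seconds counter */
    int64_t  last_sync;      /* host time when the counter was last updated */
    bool     timer_running;  /* is the once-per-second host timer armed? */
    bool     irq_enabled;    /* does the guest expect RTC interrupts? */
} DemoRtc;

/* Account for every host second that passed since the last update. */
static void demo_rtc_sync(DemoRtc *rtc, int64_t now)
{
    if (now > rtc->last_sync) {
        rtc->seconds += (uint64_t)(now - rtc->last_sync);
        rtc->last_sync = now;
    }
}

/* The RTC is not in use: stop ticking, unless interrupts must be delivered. */
void demo_rtc_idle(DemoRtc *rtc, int64_t now)
{
    demo_rtc_sync(rtc, now);
    if (!rtc->irq_enabled) {
        rtc->timer_running = false;
    }
}

/* CMOS access: add the skipped seconds, then re-enable the timer. */
uint64_t demo_rtc_read_seconds(DemoRtc *rtc, int64_t now)
{
    demo_rtc_sync(rtc, now);
    rtc->timer_running = true;
    return rtc->seconds;
}
```

While the timer is stopped the host is never woken up on the RTC's behalf, which is the source of the idle CPU savings; the guest still sees a correct time because the gap is added back in on the first access.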
From Arnon Gilboa: a preliminary patch implementing USB 2.0 EHCI emulation (in addition to the UHCI and OHCI already in QEMU).
Paul Brook objected to the timing requirements:
Qemu isn't capable of this kind of realtime response. You need to figure out an implementation that doesn't require extremely fine grained timers. In paractice you're unlikely to get better than 10ms timer resolution, and 100ms latency isn't that uncommon.
Dor Laor replied:
It still works even without accurate timing demands. Only isochronous mode will have problems and it is not yet supported for ehci.
To which Paul replied:
It could also cause problems for periodic interrupt transfers. It's not uncommon for linux hosts to have a minimum timer period of 10ms (100Hz). This means the periodic list will be traversed 80x slower than it should, so a typical for a mouse or tablet with a 10ms poll interval will only be polled every 800ms. 800ms lag on a mouse is unacceptable.
The existing USB hosts have similar issues. However the problem is an order of magnitude less severe, so isn't noticeable under normal circumstances.
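The 80x figure follows from the 125 microsecond microframe used by high-speed USB: a 10ms host timer fires once in the time the controller should have stepped through 80 microframes. A small sketch of the arithmetic, with hypothetical function names and the simplifying assumption that the emulated controller advances its schedule exactly once per host timer tick:

```c
/* How many times slower than real hardware the schedule is traversed when
 * it advances once per host timer tick instead of once per (micro)frame. */
unsigned demo_slowdown(unsigned host_timer_us, unsigned frame_us)
{
    if (frame_us == 0 || host_timer_us <= frame_us) {
        return 1;   /* the host timer keeps up; no slowdown */
    }
    return host_timer_us / frame_us;
}

/* Effective polling interval, in milliseconds, seen by a device whose
 * nominal interrupt poll interval is poll_ms. */
unsigned demo_effective_poll_ms(unsigned poll_ms,
                                unsigned host_timer_us, unsigned frame_us)
{
    return poll_ms * demo_slowdown(host_timer_us, frame_us);
}
```

With a 10000 microsecond host timer and 125 microsecond microframes, a 10ms poll interval stretches to 800ms. The same arithmetic with the 1ms frames of full-speed USB gives 100ms, which fits Paul's remark that the existing controllers suffer from the same issue to a much smaller degree.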
Dor also said:
Latest Linux host compiled HR_TIMER and DYN_TICK can give pretty good accuracy, surely it can provide 1khz and probably even 8khz
To which Paul replied:
Only if the host is lightly loaded. qemu tends to use a lot of CPU, so scheduler heuristics will tend to give it a low priority. c.f. an mp3 player that uses a small amount of CPU, so the scheduler will try hard to provide prompt signal delivery.
Mostly bugfixes and platform stuff, plus the 0.9.1 release.
This week's commits: