Another week of digesting TCG, including the conversion of sparc32 over to TCG. Development on other parts of the code showed a few signs of life, notably an update to sh4 allowing it to actually run a few programs via application emulation, and tweaks to the IDE disk cache flushing code.
Ralf Baeurwaldt objected to the fact that the prebuilt qemu binaries on the website were built for a 32 bit host, even when emulating a 64 bit target.
A longish thread ensued, which boiled down to A) the qemu developers not wanting to host more versions of the prebuilt binaries than necessary, B) 32 bit binaries running just fine on x86-64 hosts, and C) people who want to optimize performance being able to build qemu from source themselves.
An otherwise uninteresting thread became interesting when Paul Brook stated:
> > Finally, it would perhaps be best if the block device emulators wrote
> > to the qemu console to complain if they give write errors. Otherwise
> > the errno value and other important information will be lost, which
> > makes debugging hard.
>
> If by 'qemu console' you mean stderr, then fine, but please don't
> spew log messages to the monitor console, because that'll make it
> very hard to interact with reliably from management tools.

Actually I think a better default would be for qemu to die right there and then. If the host is getting IO errors then something is badly wrong, and you're probably screwed anyway.
When Ian Jackson disagreed, Paul Brook responded:
> Write errors for non-raw formats can easily be caused by a disk full
> situation on the host. Killing the guest hard is unfriendly for that
> situation.

Disk full is a fundamentally unfriendly situation to be in. There is no good answer. Reporting errors back to the host has its own set of problems. Many guest OS effectively just lock up when this occurs.
Ian disagreed again, and Jamie Lokier responded:
> I think that's fine, surely ? A locked up guest isn't very nice but
> it's better than a guest shot dead.

Well, a guest which receives an IDE write error might do things like mark parts of the device bad, to avoiding writing to those parts. If the guest is running software RAID, for example, it will radically change its behaviour in response to those errors. Sometimes that's what you want, but sometimes it is really unwanted. If the host runs out of disk space, ideally you might want to suspend the guest until you can free up host disk space (or move to another host), then resume the guest, perhaps manually.
Daniel P. Berrange suggested:
In the 'out of disk space' scenario you wouldn't need to save the guest - merely stop its CPU. This gives the host admin the opportunity to hot-add storage to the host & then resume execution, or to kill the VM, or to free enough space to save the VM to disk / live migrate it to another host.
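The three positions in this thread (kill qemu, report the error to the guest, or stop the guest's CPU) amount to a choice of policy for a failed host write. The following is only a minimal sketch of the "stop on disk full" idea, not qemu code: the enum, its values, and the function name on_write_error are all hypothetical names invented for illustration.

```c
#include <errno.h>

/* Hypothetical policy outcomes; these names do not come from qemu. */
enum write_error_action {
    ACTION_REPORT_TO_GUEST, /* surface the failure as an IDE write error */
    ACTION_STOP_VM          /* pause the guest CPU and wait for the admin */
};

/*
 * Hypothetical helper: choose what to do when a write to the host
 * backing file fails with the errno value 'err'.
 *
 * Assumption for this sketch: only "disk full" is worth pausing for,
 * because the host admin can free or add space and then resume the
 * guest. Every other error is passed on to the guest.
 */
enum write_error_action on_write_error(int err)
{
    if (err == ENOSPC) {
        return ACTION_STOP_VM;
    }
    return ACTION_REPORT_TO_GUEST;
}
```

A caller would invoke on_write_error(errno) after a failed write and act on the result. Keeping the decision in one small function would make it easy to turn the policy into a user-selectable option later.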
Jerone Young asked, and Paul Brook replied:
> The recent TCG code to replace dyngen code in qemu cvs has broken
> PowerPC host support (or from what is appears...anyone else who is not
> x86 or x86-64). Is anyone working to fix this ? Is there a plan to fix
> all the other hosts?

As far as plans go, I expect they'll get fixed when someone implements them. Someone posed (incomplete) sparc host support fairly recently. Your best bet is probably just to follow this list, and post here when you start working on a particular target.
When Hollis Blanchard objected, Paul clarified:
> I'm not really familiar with the qemu development process; is this
> usually how it works? People are free to break things and assume others
> will fix it?

Not really. However this is fairly exceptional circumstances. The gcc3 dependency means it's getting harder and harder to build qemu at all.
Following up on last week, Blue Swirl posted an updated Sparc support patch for TCG, making basic 32-bit Sparc system emulation work under TCG.
Takashi Yoshiji upgraded sh4 application emulation to run his test programs, fixing or adding over a dozen instructions:
Several instructions for sh4 need to be fixed/added. Some fixes and workarounds are needed in other part of qemu, too. # Some seems to be too dirty to be merged, though. Anyway, with a patch attached, {op,translate}.c are functional enough to run userland. Please apply.
He followed up with an sh4 fix to the pipe syscall, and then an architecture-neutral getgroups fix (which was corrected by Kirill A. Shutemov).
Ian Jackson submitted a patch, originally written by Rik van Riel, to make flushing the emulated ATA disk's write cache call fsync() on the host.
Jamie Lokier described the patch as "a very sensible improvement", but pointed out that under current 2.6 Linux kernels, fsync() only writes the data into the disk's built-in cache and doesn't wait for the hardware to flush that cache to the actual platter, so the data could still be lost in the case of a sudden power failure. This was concluded to be a kernel bug, and the thread was moved to the linux-kernel mailing list.
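The idea behind the patch can be sketched in a few lines of C. This is not the submitted patch, only an illustration assuming the disk image is an ordinary host file descriptor; the function name flush_backing_file is hypothetical. fsync() itself is the standard POSIX call.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: called when the guest asks the emulated ATA
 * disk to flush its write cache. 'fd' is assumed to be the host file
 * descriptor of the disk image.
 *
 * Returns 0 on success or a negative errno value on failure.
 */
int flush_backing_file(int fd)
{
    int ret;

    /* Retry if a signal interrupts the call before it completes. */
    do {
        ret = fsync(fd);
    } while (ret < 0 && errno == EINTR);

    return ret < 0 ? -errno : 0;
}
```

Returning the errno value instead of discarding it lets the caller decide how to report a failed flush. As noted above, a successful return only guarantees as much durability as the host kernel's fsync() provides.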
Commit 3898 is an interesting patch even for those who don't like sparc, because it shows how to convert an architecture (host and target) from dyngen to TCG.
3985, 3988, and 3992 are bugfixes, optimizations, upgrades, and cleanups to the TCG infrastructure. The rest are scattered bugfixes for curses, arm, and cris.