Summary of DEC 32-bit machine.

DMR, from notes by JFO [Joe Ossanna]


(DEC confidential--subject to non-disclosure agreement)
[presumably the agreement has expired by now!]

The project is called `VAX'-- Virtual Address Extension. The first hardware is called `STAR' (unoriginal name!) and the operating system STARLET. Its speed, in native mode, is "between 1 and 2 times the 11/70."

[It wasn't. On programs that didn't need much memory an 11/70 was noticeably faster.]
The speed emulating an 11/70 in user mode is about that of an 11/70. The cost is intended to be comparable to an 11/70.
[It was considerably more, actually.]
We could get a machine on a field test basis toward the end of 1977.
[We didn't, in our group; another research group at Bell Labs did, and produced Unix 32/V, direct predecessor to the UC Berkeley distributions]
I don't know when regular deliveries are scheduled. They are now "past the breadboard stage," which seems to mean that they have at least one machine electrically, but not mechanically, the same as the final version. I gather that a "field test" machine is free, but of course it is likely to be used for training FE's and would not be our own.

Instruction set architecture.

The machine is byte-addressed, with a 32-bit virtual address. It handles the following data formats:
[Here I delete a long section describing data formats, address modes, and instructions. It is pretty much correct; Joe must have taken excellent notes.]
Calls. The machine has a built-in calling sequence. I'll try to reproduce it exactly. Briefly, though, it appears to be possible to do just what C wants. I'll try to make clear just what the hardware does so that it can be checked.
[ Here there is a long, essentially correct, description of 'calls' and 'callg' and how to use them--access arguments, allocate and refer to locals, and so forth.]
As a side note, SCJ [Steve Johnson] with some advice from me has just written a description of what C wants from a calling sequence and what it is forced to take on some machines. So far as I can determine, this organization embodies every desirable feature that was imagined by us and several more besides. I am astonished at how well it is designed, particularly considering that this is the same company that gave us the `mark' instruction.
[1999 addition: although it's not remarked upon in the 1988 netnews posting, my gushing admiration of the VAX calling instructions is one of the things most attackable from the RISC perspective, and in fact we and others discovered that even on VAX it was possible to call faster with the simpler instructions.]


There are a lot more miscellaneous instructions.
[things like insque and find-first-set; I omit.]
Memory mapping and system features.

This area is rather complicated and somewhat less nice. The virtual address is 32 bits (maybe it was really 31, but it hardly matters). The high order bit selects "system" or "program" space; this has no protection implications, but does help determine the style of mapping. The next bit selects "program 0" or "program 1" if the "system" bit is off. "System" plus the "program 1" bit is undefined and reserved. The machine is paged, but not segmented, except that the three legal states of the program bit with the system bit select one of three page tables. The page size is XX bytes.
[Note that either we missed hearing this or they didn't say!]
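The two high-order bits described above can be decoded as in the following minimal sketch in C. The enum and function names are made up for illustration and do not come from the notes; only the bit positions and the meaning of the four combinations follow the description.

```c
#include <stdint.h>

/* Illustrative names only; the notes give no official identifiers. */
enum va_region { VA_PROGRAM0, VA_PROGRAM1, VA_SYSTEM, VA_RESERVED };

/* Classify a 32-bit virtual address by its two high-order bits,
 * as described in the notes:
 *   bit 31 = "system" bit, bit 30 = "program 0" / "program 1" bit. */
enum va_region va_classify(uint32_t va)
{
    unsigned sys = (va >> 31) & 1u;
    unsigned p1  = (va >> 30) & 1u;

    if (sys)
        return p1 ? VA_RESERVED : VA_SYSTEM;  /* system + P1 is undefined */
    return p1 ? VA_PROGRAM1 : VA_PROGRAM0;
}
```

Each of the three legal results would select one of the three page tables; the fourth is the reserved combination.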
Suppose an address lies in system space. Then the YY bits below the S and P bits are used to look up in a system page table; its base is stored in a hardware register and there is a limit. The page table word (discussed more below) gives the physical address. The system page table lies on a physical page boundary.

If the address is in program space, the page number is looked up in either the P0 or P1 page tables. The base and limits of both of these are in hardware registers; however, the base is not a physical address but is mapped according to the system address space. Incidentally, the P1 page table goes backwards in memory. One thinks of a P1 address as a moderately small 31-bit negative number.
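The "31-bit negative number" view of a P1 address can be made concrete with a small sketch in C. The function name is hypothetical; the only thing taken from the notes is that program-space addresses have the system bit off, so the remaining 31 bits can be read as a two's-complement value whose sign bit is the P1 bit.

```c
#include <stdint.h>

/* Interpret the low 31 bits of a virtual address as a signed
 * (two's-complement) 31-bit quantity. Bit 30 acts as the sign bit,
 * so P0 addresses come out non-negative and P1 addresses negative. */
int32_t va_as_signed31(uint32_t va)
{
    int64_t low31 = (int64_t)(va & 0x7FFFFFFFu);

    if (low31 & 0x40000000)          /* bit 30 set: a P1 address */
        low31 -= (int64_t)1 << 31;   /* subtract 2^31 to sign-extend */
    return (int32_t)low31;
}
```

Under this reading the highest P1 address, 0x7FFFFFFF, is -1, and addresses a little below it are small negative numbers, which fits a table that grows backwards in memory.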

The page table word ultimately accessed has a present bit, 4 bits (15 states) of protection information, and a physical address. I don't know the size of the bit field, but it is generous compared to the 2MB of memory that can be attached to the machine at the moment. There is a "modified" bit but no "accessed" bit.

The machine is designed for virtual memory. Any instruction can be restarted. They don't promise that if you look at the detailed state of things when a page-fail interrupt occurs you will see anything interesting; just that you get the virtual address of the failing reference, and that the instruction can be restarted from the beginning with the right results. The implication is that things work right, but that all pages referenced by an instruction must be in core for the whole instruction. You can't step through a piece at a time. Thus there is theoretically a minimum set of pages that have to be present, and it is not entirely trivial (perhaps as big as 20) for some of the odder instructions.

There are four protection domains, something like kernel, executive, supervisor, user. The latter three cannot execute privileged instructions, and in general they claim attempts have been made to prevent a less privileged domain from interfering with a more privileged one. The 15 states in a page table word somehow encode a nested set of access rights to the page. This must be some subset of the cross product (read, write [,execute?])X(k,e,s,u). I don't know the details. One hopes it is sensible.
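As a counting exercise only, here is one way a nested scheme could yield 15 states. The sketch rests on two assumptions that the notes do not confirm: a right granted to a domain is also granted to every more privileged domain, and any domain that may write may also read. All names are hypothetical, and nothing here says how the hardware actually assigns codes.

```c
/* Privilege levels from most to least privileged, with 0 meaning
 * "no domain at all". The ordering and both rules below are
 * assumptions for this sketch, not the documented encoding. */
enum { PROT_NONE = 0, PROT_KERNEL, PROT_EXECUTIVE,
       PROT_SUPERVISOR, PROT_USER, PROT_NLEVELS };

/* Count (read_limit, write_limit) pairs where:
 *   - each right is described by the least privileged domain that
 *     holds it (nesting), and
 *   - write_limit <= read_limit (whoever may write may also read). */
int count_nested_states(void)
{
    int count = 0;

    for (int read_limit = PROT_NONE; read_limit < PROT_NLEVELS; read_limit++)
        for (int write_limit = PROT_NONE; write_limit <= read_limit; write_limit++)
            count++;
    return count;
}
```

With five choices for the read limit and the write limit bounded by it, the count is 1 + 2 + 3 + 4 + 5 = 15, so a read/write scheme without a separate execute right is at least consistent with the number quoted.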

Critique

The design of the user-available instruction set is one of the most attractive I have ever seen. We could not investigate all the nooks and crannies, but it appears to be extremely regular in its treatment of both operators and operands; this tends to make a compiler's code generator simple (and thus more nearly able to approach optimality). DEC claims that despite the doubled number of bits in the virtual address space, the size in bytes of programs should approach that of the 11. I intend to investigate this with C outputs, but I am inclined to accept the claim. The architecture loses bits in most address modes (which occupy at least one byte, and sometimes several more), but gains in being able to express small displacements from registers and small literals. For example, to load a small constant, or a value at a small displacement from a register, takes three bytes on VAX and four on the 11.

Some care will be needed to produce programs in which all the addresses have minimal length. Fortunately, the same techniques which we use on the Interdata remain applicable.

The memory mapping is not so good, mainly because it does not seem easy to use the very large virtual address space. If information is placed at random the page tables become huge (2^21 words!). However, the user page tables can themselves be paged, and this may provide an out. I asked Steve Rothman why they did not go to a segmented scheme, and the reply was that the overhead (presumably on address-cache misses) seemed too large. I should have investigated this further, because I don't believe it. He may have had in mind segmentation combined with full mapping of the user addressing tables. This might indeed be pretty messy.
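The size of a flat page table follows directly from the size of the region it maps and the page size. The sketch below keeps both as parameters because the notes do not state the page size; the function name is hypothetical.

```c
#include <stdint.h>

/* Number of page-table entries needed to map a region of
 * 2^region_bits bytes using pages of 2^page_bits bytes.
 * Assumes region_bits - page_bits is less than 64. */
uint64_t page_table_entries(unsigned region_bits, unsigned page_bits)
{
    if (page_bits > region_bits)
        return 1;   /* the whole region fits within one page */
    return (uint64_t)1 << (region_bits - page_bits);
}
```

For instance, `page_table_entries(30, 9)` gives 2^21. The values 30 and 9 are assumed here purely because they reproduce the figure quoted above; they are not taken from the notes.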

They talked some about software. It was rather depressing. Most of it will be emulated. (Presumably in a 2MB machine you will still have to tell the assembler how big a symbol table to use.) The system itself will be new, but unimaginative. They did not seem to understand, for example, why or even how the command interpreter should be a separate process and not in the system, and why commands themselves should be processes. They are also still stuck mostly in assembly language. There are companies that are learning about how to write software, but DEC is evidently not one of them.

My general impression is that this is a remarkably good machine. DEC talked about lots of other features, such as the physical design, self-checking, and subset isolation; at least they were soothing to hear. It sounded pretty good, but it's hard to know how it will work out in practice.