annotate www/design.html @ 644:6a096902309d
Add dos2unix/unix2dos, remove old wrapper versions.
author | Rob Landley <rob@landley.net> |
---|---|
date | Mon, 30 Jul 2012 01:48:28 -0500 |
parents | e6acd7fbbfee |
children | 786841fdb1e0 |
rev | line source |
---|---|
<!--#include file="header.html" -->

<b><h2>Design goals</h2></b>

<p>Toybox should be simple, small, fast, and full featured. Often, these
things need to be balanced off against each other. In general, keeping the
code simple is the most important (and hardest) goal, and small is slightly more
important than fast. Features are the reason we write code in the first
place, but this has all been implemented before, so if we can't do a better
job why bother? It should be possible to get 80% of the way to each goal
before they really start to fight.</p>

<p>Here they are in reverse order of importance:</p>

<b><h3>Features</h3></b>

<p>The <a href=roadmap.html>roadmap</a> has the list of features we're
trying to implement, and the reasons for them. After the 1.0 release
some of that material may get moved here.</p>

<p>Some things are simply outside the scope of the project: even though
POSIX defines commands for compiling and linking, we're not going to include
a compiler or linker (and support for a potentially infinite number of hardware
targets). And until somebody comes up with a ~30k ssh implementation, we're
going to point you at dropbear or polarssl.</p>

<p>Environmental dependencies are a type of complexity, so needing other
packages to build or run is a big downside. For example, we don't use curses
when we can simply output ANSI escape sequences and trust all terminal
programs written in the past 30 years to be able to support them. (A common
use case is to download a statically linked toybox binary to an arbitrary
Linux system, and use it in an otherwise unknown environment; being
self-contained helps support this.)</p>

<b><h3>Fast</h3></b>

<p>It's easy to say lots about optimizing for speed (which is why this section
is so long), but at the same time it's the optimization we care the least about.
The essence of speed is being as efficient as possible, which means doing as
little work as possible. A design that's small and simple gets you 90% of the
way there, and most of the rest is either fine-tuning or more trouble than
it's worth (and often actually counterproductive). Still, here's some
advice:</p>

<p>First, understand the darn problem you're trying to solve. You'd think
I wouldn't have to say this, but I do. Trying to find a faster sorting
algorithm is no substitute for figuring out a way to skip the sorting step
entirely. The fastest way to do anything is not to have to do it at all,
and _all_ optimization boils down to avoiding unnecessary work.</p>

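<p>As a small illustration of skipping the sort (a sketch, not toybox code;
the function name is made up): if all you need is the largest value, one pass
over the data does the job and nothing ever gets sorted.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the largest element of a non-empty array.
 * One linear pass replaces "sort the array, then take the last element". */
int largest(const int *vals, size_t count)
{
  int best;
  size_t i;

  assert(count > 0);
  best = vals[0];
  for (i = 1; i < count; i++) if (vals[i] > best) best = vals[i];

  return best;
}
```

<p>Calling largest() on the four values 3, 9, -2, 7 returns 9 after three
comparisons, where a sort would have rearranged the whole array first.</p>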
<p>Speed is easy to measure; there are dozens of profiling tools for Linux
(although personally I find the "time" command a good starting place).
Don't waste too much time trying to optimize something you can't measure,
and there's not much point speeding up things you don't spend much time doing
anyway.</p>

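<p>When you want a number for one piece of code rather than the whole program,
something like the following sketch is usually enough. It assumes nothing
beyond standard C; cpu_seconds() and spin() are illustrative names, not part of
toybox, and clock() reports processor time rather than wall-clock time.</p>

```c
#include <time.h>

/* Run work(arg) once and return the CPU time it consumed, in seconds.
 * clock() is standard C; it counts processor time, not wall-clock time. */
double cpu_seconds(void (*work)(void *), void *arg)
{
  clock_t start = clock();

  work(arg);

  return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Hypothetical workload: add up the numbers below *arg. The volatile sink
 * keeps the compiler from optimizing the loop away. */
void spin(void *arg)
{
  volatile unsigned long sink = 0;
  unsigned long i, n = *(unsigned long *)arg;

  for (i = 0; i < n; i++) sink += i;
}
```

<p>Measure before the change and again after it, several times each, since a
single run is noisy.</p>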
<p>Understand the difference between throughput and latency. Faster
processors improve throughput, but don't always do much for latency.
After 30 years of Moore's Law, most of the remaining problems are latency,
not throughput. (There are of course a few exceptions, like data compression
code, encryption, rsync...) Worry about throughput inside long-running
loops, and worry about latency everywhere else. (And don't worry too much
about avoiding system calls or function calls or anything else in the name
of speed unless you are in the middle of a tight loop that you've already
proven isn't running fast enough.)</p>

<p>"Locality of reference" is generally nice, in all sorts of contexts.
It's obvious that waiting for disk access is 1000x slower than doing stuff in
RAM (and making the disk seek is 10x slower than sequential reads/writes),
but it's just as true that a loop which stays in L1 cache is many times faster
than a loop that has to wait for a DRAM fetch on each iteration. Don't worry
about whether "&" is faster than "%" until your executable loop stays in L1
cache and the data access is fetching cache lines intelligently. (To
understand DRAM, L1, and L2 cache, read Hannibal's marvelous RAM guide at Ars
Technica:
<a href=http://arstechnica.com/paedia/r/ram_guide/ram_guide.part1-2.html>part one</a>,
<a href=http://arstechnica.com/paedia/r/ram_guide/ram_guide.part2-1.html>part two</a>,
<a href=http://arstechnica.com/paedia/r/ram_guide/ram_guide.part3-1.html>part three</a>,
plus this
<a href=http://arstechnica.com/articles/paedia/cpu/caching.ars/1>article on
caching</a>, and this one on
<a href=http://arstechnica.com/articles/paedia/cpu/bandwidth-latency.ars>bandwidth
and latency</a>.
And there's <a href=http://arstechnica.com/paedia/index.html>more where that came from</a>.)
Code running out of L1 cache can execute one instruction per clock cycle, going
to L2 cache costs a dozen or so clock cycles, and waiting for a worst case DRAM
fetch (round trip latency with a bank switch) can cost thousands of
clock cycles. (Historically, this disparity has gotten worse with time,
just like the speed hit for swapping to disk. These days, a _big_ L1 cache
is 128k and a big L2 cache is a couple of megabytes. A cheap low-power
embedded processor may have 8k of L1 cache and no L2.)</p>

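<p>Here's a sketch of what "fetching cache lines intelligently" means in
practice. Both functions are illustrative (the names are made up) and add up
the same grid, but the first walks memory in address order and the second
strides across it. How big the gap is depends on the grid size and the
hardware, so measure it rather than trusting a rule of thumb.</p>

```c
#include <stddef.h>

/* Sum a rows x cols grid stored row by row in one flat array.
 * The inner loop walks consecutive addresses, so every cache line that gets
 * fetched is used completely before moving on. */
long sum_row_major(const int *grid, size_t rows, size_t cols)
{
  long total = 0;
  size_t r, c;

  for (r = 0; r < rows; r++)
    for (c = 0; c < cols; c++) total += grid[r*cols + c];

  return total;
}

/* Same arithmetic, but the inner loop jumps cols*sizeof(int) bytes per step.
 * On a grid much larger than the cache, nearly every access touches a
 * different cache line. */
long sum_col_major(const int *grid, size_t rows, size_t cols)
{
  long total = 0;
  size_t r, c;

  for (c = 0; c < cols; c++)
    for (r = 0; r < rows; r++) total += grid[r*cols + c];

  return total;
}
```

<p>The two return identical results; only the order of memory accesses
differs.</p>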
<p>Learn how virtual memory and memory management units work. Don't touch
memory you don't have to. Even just reading memory evicts stuff from L1 and L2
cache, which may have to be read back in later. Writing memory can force the
operating system to break copy-on-write, which allocates more memory. (The
memory returned by malloc() is only a virtual allocation, filled with lots of
copy-on-write mappings of the zero page. Actual physical pages get allocated
when the copy-on-write gets broken by writing to the virtual page. This
is why checking the return value of malloc() isn't very useful anymore: it
only detects running out of virtual memory, not physical memory. Unless
you're using a NOMMU system, where all bets are off.)</p>

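<p>A rough sketch of the point about malloc(), assuming a Linux system with an
MMU (the helper names are made up for illustration): the allocation itself only
reserves address space, and physical pages start getting used when something
writes to them.</p>

```c
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Write one byte every page-size bytes so each write lands in a different
 * page, which forces the kernel to back that page with real memory.
 * Returns the number of writes performed. */
size_t touch_pages(char *buf, size_t len)
{
  size_t page = (size_t)sysconf(_SC_PAGESIZE), count = 0, i;

  for (i = 0; i < len; i += page) {
    buf[i] = 1;
    count++;
  }

  return count;
}

/* Hypothetical demo: allocate room for the given number of pages, then
 * touch them. Returns how many writes it took, or 0 if malloc() failed. */
size_t demo_pages_touched(size_t pages)
{
  size_t len = pages * (size_t)sysconf(_SC_PAGESIZE), count;
  char *buf = malloc(len);   /* so far this is only a virtual allocation */

  if (!buf) return 0;
  count = touch_pages(buf, len);   /* now the memory is actually in use */
  free(buf);

  return count;
}
```

<p>Watching the process in top while a large buffer is allocated and then
touched is one way to see resident memory grow only at the second step.</p>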
<p>Don't think that just because you don't have a swap file the system can't
start swap thrashing: any file backed page (a la mmap) can be evicted, and
there's a reason all running programs require an executable file (they're
mmapped, and their pages can be evicted when memory is short and read back in
from the file later). And long
before that, disk cache gets reclaimed and has to be read back in. When the
operating system really can't free up any more pages it triggers the out of
memory killer to free up pages by killing processes (the alternative is the
entire OS freezing solid). Modern operating systems seldom run out of
memory gracefully.</p>

<p>Also, it's better to be simple than clever. Many people think that mmap()
is faster than read() because it avoids a copy, but twiddling with the memory
management is itself slow, and can cause unnecessary CPU cache flushes. And
if a read faults in dozens of pages sequentially, but your mmap iterates
backwards through a file (causing lots of seeks, each of which your program
blocks waiting for), the read can be many times faster. On the other hand, the
mmap can sometimes use less memory, since the memory provided by mmap
comes from the page cache (allocated anyway), and it can be faster if you're
doing a lot of different updates to the same area. The moral? Measure, then
try to speed things up, and measure again to confirm it actually _did_ speed
things up rather than made them worse. (And understanding what's really going
on underneath is a big help to making it happen faster.)</p>

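<p>To make the comparison concrete, here is a sketch of the same job done both
ways, using only standard POSIX calls. The function names are invented for the
example and this is not how any toybox command is written. Neither version is
"the fast one"; time both on your own data.</p>

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Sum every byte of a file using sequential read() calls.
 * Returns -1 on error. */
long sum_with_read(const char *path)
{
  unsigned char buf[4096];
  long total = 0;
  ssize_t len, i;
  int fd = open(path, O_RDONLY);

  if (fd == -1) return -1;
  while ((len = read(fd, buf, sizeof(buf))) > 0)
    for (i = 0; i < len; i++) total += buf[i];
  close(fd);

  return len < 0 ? -1 : total;
}

/* Same result via mmap(): no copy into a user buffer, but the kernel has to
 * set up and tear down a mapping, and pages are faulted in on first use. */
long sum_with_mmap(const char *path)
{
  struct stat st;
  unsigned char *map;
  long total = 0;
  off_t i;
  int fd = open(path, O_RDONLY);

  if (fd == -1) return -1;
  if (fstat(fd, &st) == -1) {
    close(fd);
    return -1;
  }
  if (!st.st_size) {   /* mmap() rejects zero-length mappings */
    close(fd);
    return 0;
  }
  map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
  close(fd);   /* the mapping stays valid after the descriptor is closed */
  if (map == MAP_FAILED) return -1;
  for (i = 0; i < st.st_size; i++) total += map[i];
  munmap(map, (size_t)st.st_size);

  return total;
}
```

<p>Note how much more bookkeeping the mmap() version needs (fstat, the empty
file case, munmap). That extra code is a cost too.</p>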
<p>In general, being simple is better than being clever. Optimization
strategies change with time. For example, decades ago precalculating a table
of results (for things like isdigit() or cosine(int degrees)) was clearly
faster because processors were so slow. Then processors got faster and grew
math coprocessors, and calculating the value each time became faster than
the table lookup (because the calculation fit in L1 cache but the lookup
had to go out to DRAM). Then cache sizes got bigger (the Pentium M has
2 megabytes of L2 cache) and the table fit in cache, so the table became
fast again... Predicting how changes in hardware will affect your algorithm
is difficult, and using ten year old optimization advice can produce
laughably bad results. But being simple and efficient is always going to
give at least a reasonable result.</p>

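<p>For the isdigit() case the two approaches look roughly like this (a sketch
with made-up names, not the C library's implementation). Which one wins
depends on whether the table is sitting in cache when you need it, which is
exactly the thing that keeps changing.</p>

```c
/* Computed version: one subtraction and one compare, no data memory access.
 * Anything below '0' wraps around to a huge unsigned value and fails. */
int is_digit_calc(int c)
{
  return (unsigned)(c - '0') < 10;
}

/* Table version: 256 bytes of data that have to be in cache to be fast. */
static unsigned char digit_table[256];

/* Fill in the table once at startup. */
void init_digit_table(void)
{
  int c;

  for (c = '0'; c <= '9'; c++) digit_table[c] = 1;
}

int is_digit_table(int c)
{
  return digit_table[(unsigned char)c];
}
```

<p>The computed version is also the simpler of the two: no setup call, and no
global state.</p>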
<p>The famous quote from Ken Thompson, "When in doubt, use brute force",
applies to toybox. Do the simple thing first, do as little of it as possible,
and make sure it's right. You can always speed it up later.</p>

<b><h3>Small</h3></b>
<p>Again, simple gives you most of this. An algorithm that does less work
is generally smaller. Understand the problem, treat size as a cost, and
get a good bang for the byte.</p>

<p>Understand the difference between binary size, heap size, and stack size.
Your binary is the executable file on disk, your heap is where malloc() memory
lives, and your stack is where local variables (and function call return
addresses) live. Optimizing for binary size is generally good: executing
fewer instructions makes your program run faster (and fits more of it in
cache). On embedded systems, binary size is especially precious because
flash is expensive (and its successor, MRAM, even more so). Small stack size
is important for nommu systems because they have to preallocate their stack
and can't make it bigger via page fault. And everybody likes a small heap.</p>

<p>Measure the right things. Especially with modern optimizers, expecting
something to be smaller is no guarantee it will be after the compiler's done
with it. Binary size isn't the most accurate indicator of the impact of a
given change, because lots of things get combined and rounded during
compilation and linking. Matt Mackall's bloat-o-meter is a Python script
which compares two versions of a program, and shows size changes in each
symbol (using the "nm" command behind the scenes). To use this, run
"make baseline" to build a baseline version to compare against, and
then "make bloatometer" to compare that baseline version against the current
code.</p>

<p>Avoid special cases. Whenever you see similar chunks of code in more than
one place, it might be possible to combine them and have the users call shared
code. (This is the most commonly cited trick, which doesn't make it easy. If
seeing two lines of code do the same thing makes you slightly uncomfortable,
you've got the right mindset.)</p>

<p>Some specific advice: Using a char in place of an int when doing math
produces significantly larger code on some platforms (notably ARM),
because each time the compiler has to emit code to convert it to int, do the
math, and convert it back. Bitfields have this problem on most platforms.
Because of this, using char to index a for() loop is probably not a net win,
although using char (or a bitfield) to store a value in a structure that's
repeated hundreds of times can be a good tradeoff of binary size for heap
space.</p>

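<p>A sketch of that tradeoff, with hypothetical structure and function names:
the narrow fields shrink each stored record, while the loop counter and the
running total stay plain int so the arithmetic doesn't pay for conversions.
The exact sizes depend on the platform's int size and structure padding.</p>

```c
#include <stddef.h>

/* Hypothetical record with three small values stored as full ints. */
struct wide_entry {
  int type, depth, flags;
};

/* The same record with one byte per field. */
struct narrow_entry {
  unsigned char type, depth, flags;
};

/* Heap bytes needed to store count records of each layout. */
size_t wide_bytes(size_t count)
{
  return count * sizeof(struct wide_entry);
}

size_t narrow_bytes(size_t count)
{
  return count * sizeof(struct narrow_entry);
}

/* The math stays in int: only the stored fields are narrow, and each one
 * gets widened once when it is read. */
int total_depth(const struct narrow_entry *list, int count)
{
  int i, total = 0;

  for (i = 0; i < count; i++) total += list[i].depth;

  return total;
}
```

<p>With a few hundred records the narrow layout saves real heap space, which
is the case the paragraph above describes. For a single local variable, it
saves nothing and can cost code size.</p>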
<b><h3>Simple</h3></b>

<p>Complexity is a cost, just like code size or runtime speed. Treat it as
a cost, and spend your complexity budget wisely. (Sometimes this means you
can't afford a feature because it complicates the code too much to be
worth it.)</p>

<p>Simplicity has lots of benefits. Simple code is easy to maintain, easy to
port to new processors, easy to audit for security holes, and easy to
understand. (Comments help, but they're no substitute for simple code.)</p>

<p>Prioritizing simplicity tends to serve our other goals: simplifying code
generally reduces its size (both in terms of binary size and runtime memory
usage), and avoiding unnecessary work makes code run faster. Smaller code
also tends to run faster on modern hardware due to CPU caching: fitting your
code into L1 cache is great, and staying in L2 cache is still pretty good.</p>

<p><a href=http://www.joelonsoftware.com/articles/fog0000000069.html>Joel
Spolsky argues against throwing code out and starting over</a>, and he has
good points: an existing debugged codebase contains a huge amount of baked-in
knowledge about strange real-world use cases that the designers didn't
know about until users hit the bugs, and most of this knowledge is never
explicitly stated anywhere except in the source code.</p>

c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
210 <p>That said, the Mythical Man-Month's "build one to throw away" advice points |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
211 out that until you've solved the problem you don't properly understand it, and |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
212 about the time you finish your first version is when you've finally figured |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
213 out what you _should_ have done. (The corollary is that if you build one |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
214 expecting to throw it away, you'll actually wind up throwing away two. You |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
215 don't understand the problem until you _have_ solved it.)</p> |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
216 |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
217 <p>Joel is talking about what closed source software can afford to do: Code |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
218 that works and has been paid for is a corporate asset not lightly abandoned. |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
219 Open source software can afford to re-implement code that works, over and |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
220 over from scratch, for incremental gains. Before toybox, the unix command line |
200 | 221 had already been reimplemented from scratch several times in a row (the
222 original AT&T Unix command line in assembly and then in C, the BSD | |
223 versions, the GNU tools, BusyBox...) but maybe toybox can do a better job. :)</p> | |
24
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
224 |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
225 <p>P.S. How could I resist linking to an article about |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
226 <a href=http://blog.outer-court.com/archive/2005-08-24-n14.html>why |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
227 programmers should strive to be lazy and dumb</a>?</p> |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
228 |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
229 <b><h2>Portability issues</h2></b> |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
230 |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
231 <b><h3>Platforms</h3></b> |
529 | 232 <p>Toybox should run on Android (alas, with bionic), and every other hardware |
233 platform Linux runs on. Other posix/susv4 environments (perhaps MacOS X or | |
234 newlib+libgloss) are vaguely interesting but only if they're easy to support; | |
235 I'm not going to spend much effort on them.</p> | |
24
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
236 |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
237 <p>I don't do windows.</p> |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
238 |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
239 <b><h3>32/64 bit</h3></b> |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
240 <p>Toybox should work on both 32 bit and 64 bit systems. By the end of 2008, |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
241 64 bit hardware will be the new desktop standard, but 32 bit hardware will |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
242 continue to be important in embedded devices for years to come.</p> |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
243 |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
244 <p>Toybox relies on the fact that on any Unix-like platform, pointer and long |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
245 are always the same size (on both 32 and 64 bit). Pointer and int are _not_ |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
246 the same size on 64 bit systems, but pointer and long are.</p> |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
247 |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
248 <p>This is guaranteed by the LP64 memory model, a Unix standard (which Linux |
114
ce6956dfc0cf
Add sync and an incomplete version of mdev.
Rob Landley <rob@landley.net>
parents:
24
diff
changeset
|
249 and MacOS X both implement). See |
24
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
250 <a href=http://www.unix.org/whitepapers/64bit.html>the LP64 standard</a> and |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
251 <a href=http://www.unix.org/version2/whatsnew/lp64_wp.html>the LP64 |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
252 rationale</a> for details.</p> |
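<p>A minimal sketch of how this assumption could be checked at build time. The
helper names <code>ptr_to_long()</code> and <code>long_to_ptr()</code> are
hypothetical illustrations, not toybox functions; the point is only that a
pointer survives a round trip through long when the two are the same size.</p>

```c
#include <assert.h>

// Compile-time check of the assumption: the build fails on any platform
// where pointer and long differ in size (for example 64 bit Windows).
static_assert(sizeof(long) == sizeof(void *),
  "pointer and long must be the same size");

// Hypothetical helpers (not part of toybox): store a pointer in a long
// and get it back unchanged.
long ptr_to_long(void *p)
{
  return (long)p;
}

void *long_to_ptr(long l)
{
  return (void *)l;
}
```

<p>With the static_assert in place, code that stashes a pointer in a long (or
the reverse) fails to compile on a platform that breaks the assumption,
instead of truncating the pointer at runtime.</p>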
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
253 |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
254 <p>Note that Windows doesn't work like this, and I don't care. |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
255 <a href=http://blogs.msdn.com/oldnewthing/archive/2005/01/31/363790.aspx>The |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
256 insane legacy reasons why this is broken on Windows are explained here.</a></p> |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
257 |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
258 <b><h3>Signedness of char</h3></b> |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
259 <p>On platforms like x86, variables of type char default to signed. On |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
260 platforms like arm, char defaults to unsigned. This difference can lead to |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
261 subtle portability bugs, and to avoid them we specify which one we want by |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
262 feeding the compiler -funsigned-char.</p> |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
263 |
c8d0f1876c40
Web site updates, and a design document.
Rob Landley <rob@landley.net>
parents:
diff
changeset
|
264 <p>We pick "unsigned" because that way we're 8-bit clean by default.</p> |
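<p>A small sketch of the kind of bug this avoids. The function names are
hypothetical, made up for illustration. The "naive" version gives different
answers depending on whether plain char is signed, while the version with an
explicit cast behaves the same everywhere.</p>

```c
#include <assert.h>
#include <limits.h>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

// Returns 1 if plain char is unsigned in this build (as it is with
// -funsigned-char), 0 if it is signed.
int char_is_unsigned(void)
{
  return CHAR_MIN == 0;
}

// Counts bytes with the high bit set. The explicit cast makes this
// correct whether plain char is signed or unsigned.
int count_high_bytes(const char *s)
{
  int n = 0;

  for (; *s; s++) if ((unsigned char)*s >= 0x80) n++;

  return n;
}

// Same loop without the cast. Where char is signed, bytes above 0x7f
// become negative ints, so the comparison never matches.
int count_high_bytes_naive(const char *s)
{
  int n = 0;

  for (; *s; s++) {
    int c = *s; // sign-extends when plain char is signed

    if (c >= 0x80) n++;
  }

  return n;
}
```

<p>Building with -funsigned-char makes the naive version agree with the cast
version on every platform, which is what being 8-bit clean by default means
here.</p>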
365
8f0b24cc7cd7
Minor web page updates (put header/footer back, add a few <hr> tags).
Rob Landley <rob@landley.net>
parents:
200
diff
changeset
|
265 |
616
e6acd7fbbfee
A note on error messages and internationalization.
Rob Landley <rob@landley.net>
parents:
529
diff
changeset
|
266 <b><h3>Error messages and internationalization</h3></b> |
e6acd7fbbfee
A note on error messages and internationalization.
Rob Landley <rob@landley.net>
parents:
529
diff
changeset
|
267 <p>Error messages are extremely terse not just to save bytes, but because we |
e6acd7fbbfee
A note on error messages and internationalization.
Rob Landley <rob@landley.net>
parents:
529
diff
changeset
|
268 don't use any sort of _("string") translation infrastructure.</p> |
e6acd7fbbfee
A note on error messages and internationalization.
Rob Landley <rob@landley.net>
parents:
529
diff
changeset
|
269 |
e6acd7fbbfee
A note on error messages and internationalization.
Rob Landley <rob@landley.net>
parents:
529
diff
changeset
|
270 <p>Thus "bad -A '%c'" is |
e6acd7fbbfee
A note on error messages and internationalization.
Rob Landley <rob@landley.net>
parents:
529
diff
changeset
|
271 preferable to "Unrecognized address base '%c'", because a non-English speaker |
e6acd7fbbfee
A note on error messages and internationalization.
Rob Landley <rob@landley.net>
parents:
529
diff
changeset
|
272 can see that -A was the problem, and with a ~20 word English vocabulary is |
e6acd7fbbfee
A note on error messages and internationalization.
Rob Landley <rob@landley.net>
parents:
529
diff
changeset
|
273 more likely to know (or guess) "bad" than the longer message.</p> |
e6acd7fbbfee
A note on error messages and internationalization.
Rob Landley <rob@landley.net>
parents:
529
diff
changeset
|
274 |
e6acd7fbbfee
A note on error messages and internationalization.
Rob Landley <rob@landley.net>
parents:
529
diff
changeset
|
275 <p>The help text might someday have translated versions, and strerror() |
e6acd7fbbfee
A note on error messages and internationalization.
Rob Landley <rob@landley.net>
parents:
529
diff
changeset
|
276 messages produced by perror_exit() and friends can be expected to be |
e6acd7fbbfee
A note on error messages and internationalization.
Rob Landley <rob@landley.net>
parents:
529
diff
changeset
|
277 localized by libc. Our error functions also prepend the command name, |
e6acd7fbbfee
A note on error messages and internationalization.
Rob Landley <rob@landley.net>
parents:
529
diff
changeset
|
278 which non-English speakers can presumably recognize already.</p> |
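<p>A rough sketch of the "command name plus terse message" shape described
above. <code>format_error()</code> is a hypothetical helper written for this
example, not toybox's actual error function, and it formats into a buffer
rather than printing and exiting.</p>

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

// Hypothetical helper (not toybox's real error code): writes
// "command: message" into buf and returns the length it tried to write,
// or a negative value if formatting failed.
int format_error(char *buf, size_t len, const char *cmd, const char *fmt, ...)
{
  va_list va;
  int used, rest;

  assert(buf && len);

  // Prepend the command name, which needs no translation.
  used = snprintf(buf, len, "%s: ", cmd);
  if (used < 0 || (size_t)used >= len) return used;

  // Append the terse message with its arguments filled in.
  va_start(va, fmt);
  rest = vsnprintf(buf + used, len - used, fmt, va);
  va_end(va);

  return rest < 0 ? rest : used + rest;
}
```

<p>For example, <code>format_error(buf, sizeof(buf), "od", "bad -A '%c'", 'q')</code>
fills buf with a message that names the command, the option, and the offending
value, with "bad" as the only English word.</p>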
e6acd7fbbfee
A note on error messages and internationalization.
Rob Landley <rob@landley.net>
parents:
529
diff
changeset
|
279 |
e6acd7fbbfee
A note on error messages and internationalization.
Rob Landley <rob@landley.net>
parents:
529
diff
changeset
|
280 <p>An eventual goal is UTF-8 support, although it isn't a priority for the |
e6acd7fbbfee
A note on error messages and internationalization.
Rob Landley <rob@landley.net>
parents:
529
diff
changeset
|
281 first pass of each command. (All commands should at least be 8-bit clean.)</p> |
e6acd7fbbfee
A note on error messages and internationalization.
Rob Landley <rob@landley.net>
parents:
529
diff
changeset
|
282 |
e6acd7fbbfee
A note on error messages and internationalization.
Rob Landley <rob@landley.net>
parents:
529
diff
changeset
|
283 <p>Locale support isn't currently a goal; that's a presentation layer issue, |
e6acd7fbbfee
A note on error messages and internationalization.
Rob Landley <rob@landley.net>
parents:
529
diff
changeset
|
284 X11 or Dalvik's problem.</p> |
e6acd7fbbfee
A note on error messages and internationalization.
Rob Landley <rob@landley.net>
parents:
529
diff
changeset
|
285 |
365
8f0b24cc7cd7
Minor web page updates (put header/footer back, add a few <hr> tags).
Rob Landley <rob@landley.net>
parents:
200
diff
changeset
|
286 <!--#include file="footer.html" --> |